Suggestions on "how to do it better"

Shawn K. O'Shea shawn at eth0.net
Mon Jul 23 11:45:22 EDT 2007

Well, I'm still a pretty new python programmer, and I recently wrote a
fairly simple program, but one part in particular keeps nagging at me
that there must be a better way. I figured I'd ask here and see what
you guys thought.

The goal of the program is to look at a directory of files, pare out
the ones we don't want, then take the remaining list and determine how
best to fit them on DVDs.
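
The first step (dropping unwanted files and recording sizes) might be sketched roughly as below. The function name `wanted_sizes` and its parameters are made up for illustration, and it keys the result by filename rather than by size, which is an assumption and not what the code further down does.

```python
import os

def wanted_sizes(directory, skip_prefixes):
    """Return {filename: size in bytes} for the files we want to keep.

    skip_prefixes is a tuple of filename prefixes to leave out
    (hypothetical parameter, standing in for prefixA/prefixB/prefixC).
    """
    sizes = {}
    for name in os.listdir(directory):
        # str.startswith accepts a tuple and matches any of its entries
        if name.startswith(skip_prefixes):
            continue
        path = os.path.join(directory, name)
        if os.path.isfile(path):  # ignore subdirectories
            sizes[name] = os.path.getsize(path)
    return sizes
```

Called as `wanted_sizes(os.getcwd(), ("prefixA", "prefixB", "prefixC"))`, it should give one entry per remaining file, even when two files happen to have the same size.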

Here are the main snippets of code. The part that really bothers me is
the inner while loop. I have a function that gets me a list of "what's
left that can fit on the current DVD?". I use it as the loop control
mechanism (an empty list gets us out), but then I need to call it
again inside the loop so I can use the list. IIRC, Perl will let you
do an assignment as part of an evaluated expression and also use the
assigned value as the result of that expression (so something like
"if a = somefunction()", where 'a' gets assigned the value and that
same value is also tested by the if).
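
One common Python way to get the "call once, test and use the result" effect is a `while True` loop with a `break`. The sketch below shows only that pattern. `fill_one` is a hypothetical helper, and `leftovers` here is a simplified stand-in that works on a plain list of sizes rather than on the dict used in the real program.

```python
def leftovers(sizes, boundary):
    """Sizes that still fit in the remaining space (simplified stand-in)."""
    return [s for s in sizes if s < boundary]

def fill_one(sizes, capacity):
    """Pick the largest size that fits until nothing fits any more.

    leftovers() is called exactly once per pass; its result is both
    the loop test and the data used in the loop body.
    """
    sizes = list(sizes)  # work on a copy so the caller's list is untouched
    picked = []
    remaining = capacity
    while True:
        stuff = leftovers(sizes, remaining)
        if not stuff:  # empty list: nothing else fits, so stop
            break
        biggest = max(stuff)
        picked.append(biggest)
        sizes.remove(biggest)
        remaining -= biggest
    return picked, remaining
```

For example, `fill_one([700, 300, 250, 100], 1000)` returns `([700, 250], 50)`: after 700 is taken, 300 is not strictly less than the 300 left over, so 250 is chosen instead.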

Anyway, here's some code:
import os
import re

dvdsize = 4700000000  # roughly. good enough in my case
filelist = os.listdir(os.getcwd())  # get the list of files in this dir

webs = {}   # web related stuff that we want goes in here
# Pare out the files we don't want based on prefixes in our naming scheme.
# Store the rest in a dict, key=file size in bytes, value=filename
for f in filelist:
    if not re.match('^prefixA|prefixB|prefixC', f):
        statinfo = os.stat(f)
        webs[statinfo.st_size] = f

dvdnum = 1
dvdlist = {}

def leftovers(somelist, boundary):
    morsels = [ s for s in somelist.keys() if s < boundary ]
    return morsels

# contents holds a dict (size -> filename) of the files for the current DVD
# dvdlist maps each DVD number to its contents dict, for the whole DVD set
while webs:
    remaining = dvdsize
    contents = {}
    while leftovers(webs, remaining):
        stuff = leftovers(webs, remaining)
        maxleft = max(stuff)
        contents[maxleft] = webs[maxleft]
        del webs[maxleft]
        remaining = remaining - maxleft
    dvdlist[dvdnum] = contents
    dvdnum += 1

for dvd in sorted(dvdlist):  # dict order isn't guaranteed, so sort the DVD numbers
    print "Web archives DVD #",dvd
    total = 0
    for archive in dvdlist[dvd]:
        total += archive
        print dvdlist[dvd][archive]
    print "~",total/1000/1000,"MB -",len(dvdlist[dvd]),"files"
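
Two things stand out in the code as posted. Because `webs` is keyed by file size, two files of exactly the same size collide and only one of them is kept. And a file at least as large as `dvdsize` never shows up in `leftovers`, so it is never removed from `webs` and the outer loop would not end. Below is a minimal sketch of the same greedy idea (largest file that still fits goes on the current disc) that sidesteps both. It assumes sizes are keyed by filename, uses `<=` rather than `<` for the fit test, and the name `pack` and its return shape are hypothetical.

```python
def pack(sizes, capacity):
    """Greedy packing of files onto discs.

    sizes maps filename -> size in bytes. Returns (discs, too_big):
    discs is a list of {filename: size} dicts, one per disc, and
    too_big is a sorted list of files that exceed a single disc.
    """
    too_big = sorted(name for name, size in sizes.items() if size > capacity)
    # Largest first; ties are broken by name so the result is deterministic.
    pending = sorted(
        ((size, name) for name, size in sizes.items() if size <= capacity),
        key=lambda pair: (-pair[0], pair[1]),
    )
    discs = []
    while pending:
        remaining = capacity
        disc = {}
        unplaced = []
        # One pass in descending order is the same as repeatedly taking
        # "the largest that still fits", since remaining only shrinks.
        for size, name in pending:
            if size <= remaining:
                disc[name] = size
                remaining -= size
            else:
                unplaced.append((size, name))
        discs.append(disc)
        pending = unplaced  # whatever did not fit goes to the next disc
    return discs, too_big
```

With `{"a": 700, "b": 300, "c": 300, "d": 250, "huge": 2000}` and a capacity of 1000, this gives two discs, `{"a": 700, "b": 300}` and `{"c": 300, "d": 250}`, and reports `["huge"]` as too big. Like the original, it is a greedy heuristic and is not guaranteed to use the fewest possible discs.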


