linux - Bash, splitting files from a directory into groups by size (low performance)
I have a problem with a bash script that splits the files in a directory into groups, each group roughly 1 GB in size.
My script looks like this:
#!/bin/bash
path=$1

unset i
echo $path
echo 'start'
fpath=`pwd`"/files"
find "$path" -type f > files
max=`wc -l $fpath | awk '{printf $1}'`
# First pass: record every file name and its size
while read file; do
    files[i]=$file
    size[i]=$(du -s $file | awk '{printf $1}')
    ((i++))
    echo -ne $i/$max'\r'
done < `pwd`"/files"
echo -ne '\n'
echo 'sizes and filenames done'
unset weight index groupid
# Second pass: accumulate files until a group exceeds 1 GB, then write it out
for item in ${!files[*]}; do
    weight=$((weight+${size[$item]}))
    group[index]=${files[$item]}
    ((index++))
    if [ $weight -gt "$((2**30))" ]; then
        ((groupid++))
        for filename in "${group[@]}"; do
            echo $filename
        done > euenv.part"$groupid"
        unset group index weight
    fi
done
# Write the last (partial) group
((groupid++))
for filename in "${group[@]}"; do
    echo $filename
done > euenv.part"$groupid"
echo 'done'
It works, but it is very slow. Can you give me some advice on how to make it faster? Thanks.
Below are a few suggestions. I have not implemented them myself, so I cannot say how much performance they gain, but I hope they give you an idea of how to make the script faster.
- The first loop can be avoided if you compute each file's size on the fly in the second loop. Replace

weight=$((weight+${size[$item]}))

in the second loop with:

size=$(du -s ${files[$item]} | awk '{printf $1}')
weight=$((weight+size))
- The temporary file files can be avoided if you replace

for item in ${!files[*]}; do

with

find "$path" -type f | while read file; do

and replace ${files[$item]} with ${file}.
- The separate du call for every file can be avoided if, instead of

find "$path" -type f

you use

find "$path" -type f -ls

and extract the name and size columns from its output. A combined sketch of these ideas follows below.
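Putting the suggestions together, here is a minimal, untested sketch of a single-pass version. It assumes GNU find and uses -printf '%s\t%p\n' rather than -ls simply because the size and name are then easier to split apart; as a consequence sizes are in bytes (your du -s reported 1K blocks), while the euenv.partN output names and the 2**30 threshold are kept from your script. It appends to the group files, so remove any leftover euenv.part* before running, and it does not handle file names that contain newlines.

#!/bin/bash
# Single pass: read "size<TAB>name" lines straight from find,
# so there is no temporary file and no per-file du call.
path=$1
limit=$((2**30))                 # same 1 GB threshold as the original script

weight=0
groupid=1
find "$path" -type f -printf '%s\t%p\n' |
while IFS=$'\t' read -r size file; do
    echo "$file" >> euenv.part"$groupid"    # add the file to the current group
    weight=$((weight + size))
    if (( weight > limit )); then           # group is full, start the next one
        ((groupid++))
        weight=0
    fi
done
echo 'done'

The whole job is then a single pipeline over find's output, so the per-file fork of du and awk, which is where most of the original script's time goes, disappears.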