linux - Fastest way to delete files from a directory tree whose names contain a certain string
I have a directory containing subdirectories, and I need to delete the files in them whose names contain "out". What is the fastest method of doing this?
I have tried several things.
A simple:
rm */*out*
Perl:
perl -e 'for ( <*/*out*> ) { ( (stat)[9] < (unlink) ) }'
Each of these seems to take a serious amount of time. With 1,000 subdirectories, each containing around 50 files matching *out*, it takes:
Perl: ~25 mins
rm */*out*: ~18 mins
I also tried rsync, moving the files to a folder first and then syncing with delete, but that took ages.
Does anyone have a faster way of getting rid of these files? This seems inordinately slow to me.
I find test3 the fastest (11-25 sec). Why not test it yourself?
Your filesystem can have a big impact on performance.
The test uses GNU Parallel.
# make test set: 150000 files, 50000 named *.seq
testset() {
  doit() {
    mkdir -p "$1"
    cd "$1" && parallel --results ./{} seq ::: {1..50}
  }
  export -f doit
  seq 1000 | parallel --bar doit >/dev/null
  # drop caches before starting test
  echo 3 | sudo tee /proc/sys/vm/drop_caches >/dev/null
}
export -f testset

# define tests
test1() {
  find . -name '*seq' | perl -ne 'chop;unlink'
}
export -f test1

test2() {
  find . -name '*seq' -delete
}
export -f test2

test3() {
  find . -name '*seq' | parallel --pipe -n1000 -q perl -ne 'chop;unlink'
}
export -f test3

test4() {
  find . -name '*seq' -print0 | xargs -0 -P2 rm
}
export -f test4

test5() {
  find . -name '*seq' -print0 | xargs -0 rm
}
export -f test5

test6() {
  find . -name '*seq' | perl -e 'chomp(@a=<>);unlink @a'
}
export -f test6

test7() {
  # sort by inode
  ls -U -i */*seq* | sort -k1,1 -n | cut -d' ' -f2- | perl -e 'chomp(@a=<>);unlink @a'
}
export -f test7

# run testset/test? alternating
eval parallel --joblog jl -uj1 ::: testset' 'test{1..7}
# sort by runtime
sort -nk4 jl
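If you want to try the find-based variant (test2) against the *out* pattern from the question before running it on real data, something like the following may help. It is only a minimal sketch: the function name delete_matching and the demo file names are made up for illustration, and it assumes a find that supports -delete (GNU and BSD find both do). It runs entirely inside a throwaway directory created with mktemp.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the answer above): delete regular files
# under DIR whose names match PATTERN, using find's -delete action.
delete_matching() {
    local dir=$1 pattern=$2
    find "$dir" -type f -name "$pattern" -delete
}

# Build a tiny demo tree in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
mkdir -p "$demo/sub1" "$demo/sub2"
touch "$demo/sub1/a.out.txt" "$demo/sub1/keep.txt" \
      "$demo/sub2/b_out" "$demo/sub2/keep.log"

# Dry run: count the files that would be removed, without deleting anything.
matched=$(find "$demo" -type f -name '*out*' | wc -l | tr -d ' ')
echo "would delete: $matched"

# Real run, then count what is left.
delete_matching "$demo" '*out*'
remaining=$(find "$demo" -type f | wc -l | tr -d ' ')
echo "files left: $remaining"

# Clean up the demo tree.
rm -rf "$demo"
```

On the real tree you would call delete_matching . '*out*' from the top-level directory, ideally after checking the dry-run listing first. Whether this beats test3 on your machine is something only a timing run on your own filesystem can tell you.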