Friday, 13 November 2015

selective backup using find, grep, cpio

I find this script helpful for making a fast backup copy of my Unix/Linux systems.
What makes the script special (at least to me) is that it runs unchanged on SCO Unix, Sun Solaris and Linux.

cd /
find . -depth -print | \
 grep -v "files/dirs to be skipped" | \
 cpio -ovBc > /dev/st0

The cpio utility is quite versatile and easy to use; please refer to the cpio man page for its full feature set.
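
To restore from the same tape, something along these lines should work (a minimal sketch mirroring the backup command above: -d creates missing directories, -u overwrites existing files, -m preserves modification times):

cd /
cpio -ivBcdum < /dev/st0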

================================================

I have several PCs, notebooks and Android phones from which I do e-banking transactions, such as paying bills, paying the credit card, buying things from eBay, transferring money to my kids who are away studying, etc... Every transaction's receipt is important, to keep track of which money went where, but they are scattered across several devices! (I don't want to use a cloud drive: not just yet.)


To find all non-PDF files and put the list into a text file my-non-pdf:

# find /home -depth -print | grep -v '\.pdf$' > my-non-pdf
   


To find all PDF files and put the list into a text file mypdf:

# find /home -name "*.pdf" > mypdf
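
Since cpio reads its file list from standard input, the mypdf list can be fed straight back into it to archive just those receipts (a rough sketch; /tmp/receipts.cpio is only an example destination, any file or tape device would do):

# archive every file listed in mypdf into a single cpio file
cpio -ovc < mypdf > /tmp/receipts.cpio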


oohhhh, to be continued another time.... coz dinner is ready, wife is waiting for me.. :)

You could do this with the find command:
find /public_html -mindepth 1 -iname "*.pdf" -type f > output-file
Explanation:
/public_html     # Directory in which the find operation takes place.

-mindepth 1      # Skip the starting directory itself; subdirectories are searched by default.

-iname "*.pdf"   # Name must end with .pdf (match is case-insensitive).

-type f          # Only regular files.
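
For example, you could then check how many PDFs were picked up and peek at the first few paths (an illustrative follow-up using the same output-file name):

wc -l output-file     # how many PDFs were found
head output-file      # a quick look at the first few entries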

A related problem:

Speeding up a Shell Script (find, grep and a for loop)

Hi all,

I'm having some trouble with a shell script that I have put together to search our web pages for links to PDFs.

The first thing I did was:


Code:
ls -R | grep .pdf > /tmp/dave_pdfs.out

Which generates a list of all of the PDFs on the server. For the sake of argument, say it looks like this:

file1.pdf
file2.pdf
file3.pdf
file4.pdf

I then put this info into an array in a shell script, and loop through the array, searching all .htm and .html files in the site
for the value:


Code:
# The Array
pdfs=("file1.pdf" "file2.pdf" "file3.pdf" "file4.pdf")

# Just a counter that gets incremented for each iteration
counter=1

# For every value in the array
for value in "${pdfs[@]}"
do

# Tell the user which file is being searched for, and how far along in the overall process we are.
echo "Working on $value..."
echo "($counter of ${#pdfs[*]})"

# Add what is being searched for to the output file
echo "$value is linked to from" >> /tmp/dave_locations.out

# Find all .htm and .html files containing the filename we are looking for, and add them to the output file
find . -name "*.htm*" -exec grep -l "$value" {} \; >> /tmp/dave_locations.out

# Adding a space afterwards
echo " " >> /tmp/dave_locations.out

# Increment the counter.
counter=`expr $counter + 1`

done

This does work.

However, our site is huge (1491 PDFs, and a whole lot of .htm and .html pages). Each iteration through the loop
takes about 55 seconds. I've calculated that this shell script will take 6 days to complete.

Does anyone please know of a better (and significantly faster) way of doing this?

Any help would be greatly appreciated. I'm a bit of a unix newbie, and it took me hours just to get this far.
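
One way to cut this down dramatically (a rough sketch, assuming GNU find and grep; the /tmp paths are reused from the post above) is to drop the per-PDF loop and let grep look for all 1491 names in a single pass over the HTML files:

Code:
# Build the list of PDF basenames once, one per line
find . -type f -name "*.pdf" -printf "%f\n" | sort -u > /tmp/dave_pdfs.out

# One pass over every .htm/.html file:
# -F = fixed strings, -f = read the patterns from the list file,
# -o = print each matching name, -H = prefix it with the page's filename
find . -type f -name "*.htm*" -exec grep -oHFf /tmp/dave_pdfs.out {} + > /tmp/dave_locations.out

Each HTML page is read once instead of once per PDF, and the output pairs page with PDF name (for example page.html:file1.pdf), which can then be sorted or grouped as needed.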