

Handy Scripts and Programs

Parallel Processing with find and xargs :: Run Parallel Jobs in Bash Shell :: Save an Audio Stream to Disk :: Automatic Pocket-Drive Backups :: Automatic Digital Camera Unloader :: UMN Resnet Auto-Login Script :: Gnuplot Script

You might also be interested in my Lightsaber Rotoscoping Scripts.


Parallel Processing with find and xargs

If you have a lot of file-based, computationally intensive jobs to process from the command line, you may be interested to learn that xargs can schedule jobs in parallel. For example, if you have to run the command long_processing on all the .dat files in a directory tree, you can use this snippet of find and xargs:

find . -name '*.dat' -type f -print0 | xargs -0 -P 4 -n 1 long_processing

The -n 1 option passes only one filename argument to each invocation of long_processing, and -P 4 runs at most four concurrent executions of long_processing. Adjust the number four to match the number of jobs you actually want to run at once, probably no more than the number of CPU cores in your computer.

Check out this blog post at the Wayne and Layne Blog for more details and a useful application.

You should also check out GNU Parallel which does the same sort of thing, only better.
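
As a minimal sketch (assuming GNU Parallel is installed), this is roughly equivalent to the find/xargs pipeline above: -0 reads null-separated filenames and -j 4 caps the number of simultaneous jobs, and {} is replaced with each filename.

find . -name '*.dat' -type f -print0 | parallel -0 -j 4 long_processing {}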


Run Parallel Jobs in Bash Shell

If you're not interested in teaching yourself the glory and torture that is pthreads, but still have a lot of jobs to run, you might be interested in this. Frequently I need to explore a design space by tuning different parameters and running a simulation, which makes for a lot of simulation runs. With all the multi-core machines available nowadays, it's nice to run lots of jobs in parallel to reduce the waiting time, but you don't want to run more jobs than there are available cores. This script will let you run as many jobs in parallel as you want, starting the next job as soon as a running one finishes.

This might not work as well if your program's name is something generic like, say, a.out or MATLAB, so you might have to be more creative in the line that checks how many jobs are running. Maybe change the ps command to only include processes belonging to your username, or do something fancier with the grep regex.

#!/bin/bash
# Run processing jobs in the background, but only as many as you actually want.
# Useful for doing parameter searches of a problem space with simulation results.
# Customize the script, specifically the program name regular expression and the
# maximum number of jobs to run in parallel. You'll also need to edit the main for-
# loop's indices (you could even add more loops to search a 2D or 3D parameter space.)
#
# Written by Matthew Beckler (matthew at mbeckler dot org)

# put the first character in square brackets,
# to prevent the grep command from showing up
# in the process list with the same string as the program
PROGNAME_REGEX="[m]y_prog"

# set this equal to the number of cores, or however many
# jobs you want to run concurrently
MAXJOBS="4"

for i in $(seq 1 10); do
    # while we are running the max number of jobs, wait here
    while [ "$(ps aux | grep "$PROGNAME_REGEX" | wc -l)" -ge "$MAXJOBS" ]; do
        sleep 1
    done

    # You can use the parameter $i to change the behavior or
    # settings of your executing program.
    # The & at the end starts this program in the background.
    ./my_prog "$i" &
done

wait # block here until the last batch of background jobs finishes
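
If you'd rather not grep the process table at all, here is an alternative sketch using bash's own job control (my addition, not part of the original script; it assumes bash 4.3 or newer, which added wait -n):

#!/bin/bash
# Throttle background jobs with bash job control instead of ps/grep.
# `wait -n` (bash 4.3+) blocks until any one background job exits.
MAXJOBS=4

for i in $(seq 1 10); do
    # jobs -rp prints the PIDs of currently running background jobs
    while [ "$(jobs -rp | wc -l)" -ge "$MAXJOBS" ]; do
        wait -n
    done
    ./my_prog "$i" &   # same hypothetical program name as above
done

wait # let the final batch of jobs finish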

Save an Audio Stream to Disk

In my hometown of Eau Claire, WI, there is a very cool community radio station called WHYS 96.3 FM. Since I don't live there anymore, I was pleased to see that they are streaming their programming using ShoutCast. There is a radio show on Monday evenings that I want to listen to, but I have other obligations scheduled during that time, so I had to come up with a way to record the stream to disk, and listen to it later.

The script below will record a stream using mplayer to a wav file, and then encode it to MP3 using the Lame encoder. Using the variables at the top of the file, you can configure the stream URL, duration of recording, output filename, and working directory. All output is suppressed so you can call it from cron at the proper time.

In the code below, the call to mplayer is pretty lengthy, and has been split over two lines. In your script, you can just use one really long line, in which case you should remove the end-of-line backslash that was added below.

#!/bin/bash
# Save streaming audio to an mp3 file. Requires lame.
# This script will record for the duration entered below.
# Customize the stream URL, duration, filename, and working directory.
# 
# Written by Matthew Beckler (matthew at mbeckler dot org)

STREAM="http://1.2.3.4:8140"
DURATION="1:30:00"
DATESTR="$(date +%Y-%m-%d)"
# Don't include .wav or .mp3 at the end of the variable below:
FILENAME="Stream_dump_$DATESTR"
WORKINGDIR="/data/pub/audio/podcasts/"

cd "$WORKINGDIR" || exit 1
rm -f "$FILENAME".{wav,mp3}
mplayer "$STREAM" -endpos "$DURATION" -vo null \
    -ao pcm:waveheader:file="$FILENAME".wav &> /dev/null
lame "$FILENAME".wav "$FILENAME".mp3 &> /dev/null
rm -f "$FILENAME".wav
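
Since all output is suppressed, a single crontab entry is enough to schedule the recording. As an example (the script path here is my assumption; use wherever you saved it), this line starts recording every Monday at 7:55 PM:

# m  h   dom mon dow  command
55   19  *   *   1    /home/matthew/bin/record_stream.sh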

Automatic Pocket-Drive Backups

This script will automatically mount, copy, and archive the contents of any device, such as a USB pocket drive. Make sure the device is listed in the fstab and mountable by regular users; if the script needed to mount the device, it will unmount it when finished. It creates a directory named after today's date inside the specified directory, copies all the files into it, archives the entire folder (preserving permissions and directory structure) to a gzipped tar archive, and then cleans up, leaving only the archive file. For my use, I created a directory called 'pocket_drive_backups' in my home directory and put this script in it. The important variables are at the top of the script so you can easily modify it yourself.
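
For reference, a typical /etc/fstab line for such a device might look like the following (the device node and filesystem type are assumptions; adjust them for your drive). The user option is what makes it mountable by regular users:

/dev/sdb1   /mnt/usb   vfat   user,noauto   0   0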

#!/bin/bash
# Script to automatically back up a pocket drive.
# Creates a folder based on today's date, then tars and gzips the folder
# to a file of the same name (.tar.gz).
#
# Written by Matthew Beckler ( matthew @ mbeckler . org )

MOUNTPOINT='/mnt/usb'
BASE="/home/matthew/pocket_drive_backups/"
FOLDER="$(date +%Y%m%d)"
DIRECTORY="$BASE$FOLDER"
ARCHIVE="$FOLDER.tar.gz"
HADTOMOUNT=0

echo "Checking if device is mounted"
mount | grep -i $MOUNTPOINT &> /dev/null
if [ $? -eq 0 ]
then
        echo "Device is already mounted"
else
        echo "Mounting device . . ."
        mount $MOUNTPOINT &> /dev/null
        if [ $? -ne 0 ]
        then
                echo "Could not mount device"
                exit 1
        fi
        HADTOMOUNT=1
fi

echo "Checking if directory exists"
if [ -e "$DIRECTORY" ]
then
        echo "The directory '$DIRECTORY' already exists"
        # don't leave the device mounted behind us on an error exit
        if [ $HADTOMOUNT -eq 1 ]; then umount $MOUNTPOINT; fi
        exit 1
else
        echo "Creating directory: '$DIRECTORY' . . ."
        mkdir "$DIRECTORY"
fi

cd "$BASE" || exit 1

echo "Copying files . . ."
cp -av $MOUNTPOINT/* $DIRECTORY

echo "Creating archive '$ARCHIVE'. . ."
tar -cvzpf $ARCHIVE $FOLDER

echo "Removing directory . . ."
rm -rf $DIRECTORY

if [ $HADTOMOUNT -eq 1 ]
then
        echo "Unmounting device . . ."
        umount $MOUNTPOINT
        if [ $? -ne 0 ]
        then
                echo "Could not unmount device"
        fi
fi

exit 0

Automatic Digital Camera Unloader

This script will automatically unload your digital camera's pictures. It is very similar to the Automatic Pocket-Drive Backup script above, but it merely creates a dated directory and copies over the pictures. You will have to customize the top three variables (MOUNTPOINT, SUBFOLDER, and PICTURESDIR), which will be different for every computer and camera.

#!/bin/bash
# Script to automatically unload (without removing) pictures from a camera.
# Creates a folder based on today's date, and copies all the files over.
#
# Written by Matthew Beckler ( matthew @ mbeckler . org )
MOUNTPOINT="/mnt/usb"
SUBFOLDER="$MOUNTPOINT/dcim/101msdcf/"
PICTURESDIR="/home/matthew/pictures/$(date +%Y%m%d)"

HADTOMOUNT=0

echo "Checking if device is mounted"
mount | grep -i $MOUNTPOINT &> /dev/null
if [ $? -eq 0 ]
then
	echo "Device is already mounted"
else
	echo "Mounting device . . ."
	mount $MOUNTPOINT &> /dev/null
	if [ $? -ne 0 ]
	then
		echo "Could not mount device"
		exit 1
	fi
	HADTOMOUNT=1
fi

echo "Checking if source directory exists"
if [ ! -e "$SUBFOLDER" ]
then
	echo "The source directory $SUBFOLDER does not exist."
	echo "Are you sure this is a digital camera?"
	# don't leave the device mounted behind us on an error exit
	if [ $HADTOMOUNT -eq 1 ]; then umount $MOUNTPOINT; fi
	exit 1
fi

echo "Checking if destination directory exists"
if [ -e "$PICTURESDIR" ]
then
	echo "The directory '$PICTURESDIR' already exists"
	# don't leave the device mounted behind us on an error exit
	if [ $HADTOMOUNT -eq 1 ]; then umount $MOUNTPOINT; fi
	exit 1
else
	echo "Creating directory: '$PICTURESDIR' . . ."
	mkdir "$PICTURESDIR"
fi

echo "Copying pictures . . ."
cp -av $SUBFOLDER* $PICTURESDIR

echo "Finished copying pictures"

if [ $HADTOMOUNT -eq 1 ]
then
	echo "Unmounting device . . ."
	umount $MOUNTPOINT
	if [ $? -ne 0 ]
	then
		echo "Could not unmount device"
	fi
fi
exit 0

UMN Resnet Auto-Login Script

When I was living there (2004-2006), the University of Minnesota had a pretty bad residence hall network authentication system. For Linux and Mac users, who were frequently booted off the network, the only official response was to clear your DHCP cache and reboot. This script would automatically authenticate with the UMN Resnet system, which may have changed in the years since then. You will need to customize the USERNAME and PASSWORD variables. You might have to change COOKIEJAR to something that works for your machine. All output is suppressed, so you can use this script in your crontab.

In the code below, the calls to curl are pretty lengthy, and have been split over multiple lines. In your script, you can just have one really long line, in which case you should remove the end-of-line backslashes that have been added below.

#!/bin/bash
# Script to auto-login to UMN Resnet
#
# Written by Matthew Beckler ( matthew at mbeckler dot org )

USERNAME=beck0778
PASSWORD=password
COOKIEJAR='/tmp/resnet-cookie'

wget http://www.google.com/ -O /dev/null &> /dev/null
if [ $? -ne 0 ]; then
    # could not access google.com - time to log in
    curl "https://resnet.netsec.umn.edu/fall2005/?original=http://www.google.com/" \
        -F username=$USERNAME -F password=$PASSWORD -c $COOKIEJAR -L \
        -o /dev/null &> /dev/null
    # wait 30-120 seconds for the login to register - you might want to adjust this
    sleep 30
    # finalize the login
    curl https://resnet.netsec.umn.edu/fall2005/ -b $COOKIEJAR -L \
        -o /dev/null &> /dev/null
fi

exit 0

Gnuplot Script

Gnuplot is a pretty impressive open-source data plotting package. It's much easier to work with if you create an external script to run, instead of typing in the commands each time. Here is an example script file that I keep around as a template for my lab reports. Since I'm using these images in a LaTeX document, I export them as Encapsulated PostScript (EPS), but you might want to change the file format to suit your needs. You can also use log axes via the set logscale x command, which likewise works for the y-axis. A quick note regarding the CSV data file: make sure to have spaces after the commas, since gnuplot splits columns on whitespace. The 1:5 business in the last line indicates which columns to use for the x and y data values.

set autoscale
unset log
unset label
#set logscale x #uncomment this to make a log plot
set xtic auto
set ytic auto
set title "Chart Title"
set xlabel "Independent Variable (Units)"
set ylabel "Dependent Variable (Units)"
set terminal postscript eps
set output "output_filename.eps"
plot "data.csv" using 1:5 notitle

You could save this script as, say, plot.script, and generate the plot using gnuplot plot.script.
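
To illustrate the data format, here is a made-up data.csv with five columns and a space after each comma; the plot command above reads columns 1 and 5:

1, 10.2, 0.5, 3, 2.1
2, 19.8, 0.7, 4, 4.4
3, 30.1, 0.9, 5, 9.3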



Copyright © 2004 - 2024, Matthew L. Beckler, CC BY-SA 3.0
Last modified: 2010-07-26 08:55:33 AM (EDT)