Podcasts from the command line.

I work from my home office, so I don’t have to listen to what the guy in the cubicle next to me likes.  That’s good and bad, but in my case it’s a moot point – my office in the basement can barely pick up any local radio stations.  Just a few short years ago I would have had to resort to a collection of CDs or tapes (or running a long set of speaker wires from the living room radio down to the office).  Thankfully, technology came along and rescued me from the boredom of the same CDs on endless repeat – enter the podcast.

According to the Wikipedia entry, the term was coined in early 2004.  I must have been right on the cusp, because it wasn’t long after that that I was finishing our basement and ran into the entertainment problem.  Somehow I came across some tech-related podcasts (DailySourcecode, TWiT), so I downloaded a few and played them through my laptop.  That all worked well, but it meant that each time I finished one, I had to take the laptop back up to the network connection (my WiFi router had died and hadn’t been replaced) and download the next one.  A podcast is nothing more than an MP3 file, so copying the files to the laptop was quick, but it was still a manual step, and I had to keep track of what I had already listened to so I didn’t download a show twice.  After a couple of evenings of this I started searching for a way to download them in the background while I was at work, so I could have hours of uninterrupted geek-talk while working in the basement.

A quick bit of Googling led me to BashPodder.  Since I was running Linux on my home system, this was a great fit.  (The BashPodder website says that it also runs on many other OSes, including Mac OS X and Windows.)  You only need three files to make it all work:

1: The bashpodder.shell script – this is the main program that retrieves the requested podcast files.

2: The parse_enclosure.xsl file – this is used by the script to extract the podcast file names and download URLs.

3: The bp.conf file – this is a simple text file containing a list of podcast feed URLs, one per line.
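
To make the bp.conf format concrete, here is a minimal sketch that writes a sample feed list and reads it back the same way the script's main loop does. The file name bp.conf.sample and both URLs are placeholders I made up for illustration, not real feeds.

```shell
# Write a sample feed list. These URLs are placeholders, not real feeds.
conf=bp.conf.sample
cat > "$conf" <<'EOF'
https://example.com/show-one/feed.xml
https://example.org/show-two/rss
EOF

# Read the list one line at a time, one feed URL per line.
feed_count=0
while read -r podcast; do
    echo "Feed: $podcast"
    feed_count=$((feed_count + 1))
done < "$conf"
echo "Feeds listed: $feed_count"
```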

Download these files from the BashPodder website, or you’re welcome to use my tweaked version here.

Finally, to listen to them from the command line I wrote a script I cleverly call “Play And Delete” or “pad” for short.

Here’s the bashpodder.shell script I am currently using:

#!/bin/bash
# By Linc 10/1/2004
# Find the latest script at http://lincgeek.org/bashpodder
# Revision 1.21 12/04/2008 - Many Contributers!
# If you use this and have made improvements or have comments
# drop me an email at linc dot fessenden at gmail dot com
# and post your changes to the forum at http://lincgeek.org/lincware
# I'd appreciate it!
QUIET=-q
#QUIET=-v

#
if [ -e /var/tmp/bashpodder.FAIL ] ; then
	echo "Will not run - /var/tmp/bashpodder.FAIL exists." ; exit 1
fi

# Make script crontab friendly:
cd $(dirname $0)

# datadir is the directory you want podcasts saved to:
datadir=$(date +%Y-%m-%d)

# create datadir if necessary:
mkdir -p $datadir

# Delete any temp file:
rm -f temp.log

# Read the bp.conf file and wget any url not already in the podcast.log file:
date >> ordered.log
while read podcast
	do
	file=$(xsltproc parse_enclosure.xsl $podcast 2> /dev/null | sed 's# #%20#g' || wget -q $podcast -O - | tr '\r' '\n' | tr \' \" | sed -n 's/.*url="\([^"]*\)".*/\1/p')
	for url in $file
		do
		echo $url >> temp.log
		if ! grep "$url" podcast.log > /dev/null ; then
			name=$(echo "$url" | awk -F'/' {'print $NF'} | awk -F'=' {'print $NF'} | awk -F'?' {'print $1'})
			# Fixes for different URLs that parse to incorrect file names.
			# Buzz Out Loud has the name first but it's a redirect URL...
			if echo "$url" | grep -q 'dl_dlnow$' ; then
				name=$(echo $url | awk -F? '{ print $1 }' | awk -F'/' '{ print $NF }')
				#echo FIXING: $url
				#echo NEWNAME: $name
			fi

			wget -t 10 -U BashPodder -c $QUIET -O $datadir/$name "$url"

			touch $datadir/$name
			echo "$url" >> ordered.log
		fi
		done
	done < bp.conf
EC=0
# Move dynamically created log file to permanent log file:
cat podcast.log >> temp.log || EC=1
cp podcast.log podcast.log.previous || EC=1
sort temp.log | uniq > podcast.log || EC=1
rm temp.log || EC=1
if [ $EC -gt 0 ] ; then
	echo FAILED to update podcast.log file. > /var/tmp/bashpodder.FAIL
	touch /var/tmp/bashpodder.FAIL
	exit 9
fi
# Create an m3u playlist:
ls $datadir | grep -v m3u > $datadir/podcast.m3u

# Misc cleanup
mv */*JPG /home/dan/Pictures/Backgrounds/

Most of the changes I have made were to fix problems on my system.  One update was to better handle a full hard drive – that really confused the BashPodder script about what to download.  The script writes a “podcast.log” file that it checks each time it runs to decide whether it needs to download a podcast.  If the podcast URL isn’t in the podcast.log file, it downloads the file and adds that URL to the log.  That works great until the drive fills up and it is unable to update this file.  In my case, the log file got erased, so when I did free up space, BashPodder had to start over and tried to re-download everything.  (Some day I’ll document how I fixed that, but not today.)
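
The log check described above can be sketched on its own. This is an illustration rather than the code BashPodder uses: the function names are made up, the URL is a placeholder, and the match is a whole-line fixed-string comparison (grep -Fx), which is stricter than the plain grep in the script.

```shell
# Succeed if the URL is already recorded in the log file.
# -F: fixed string (no regex), -x: match the whole line, -q: no output.
already_downloaded() {
    [ -f "$2" ] && grep -Fxq -- "$1" "$2"
}

# Record a URL after a successful download.
record_download() {
    echo "$1" >> "$2"
}

# Demo with a throwaway log and a placeholder URL.
log=$(mktemp)
url="https://example.com/show/episode-1.mp3"
already_downloaded "$url" "$log" || echo "new: would download $url"
record_download "$url" "$log"
already_downloaded "$url" "$log" && echo "seen: skipping $url"
rm -f "$log"
```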

My changes start at line 13. If the ‘magic’ bashpodder.FAIL file exists, it means there was a problem in a previous run and the system needs human intervention.

Line 30 adds a simple date to my log file named “ordered.log”.  I wanted a record of when each file was downloaded that I could review later.

Lines 38 through 50 are a mixture of original and new code.

  • Line 38 tries to pull out the file name that will be used later.  Some podcast URLs confuse the parsing done by the parse_enclosure.xsl template, so this helps lines 41 through 45 fix the name if necessary.
  • Line 47 was modified slightly to use the new name if necessary.
  • Line 49 makes sure the date of the file matches the current system time.  The ‘pad’ script sorts the files by their timestamp so this keeps them accurate.
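
The file name handling in the bullets above can also be sketched with shell parameter expansion instead of awk. This is a simplified variant, not the script's exact logic: it only strips the query string and takes the last path component, which is what the redirect fix does. The function name and URL are hypothetical.

```shell
# Derive a local file name from a download URL.
url_to_name() {
    local url="$1"
    local name
    name=${url%%\?*}   # drop the query string, if there is one
    name=${name##*/}   # keep only the text after the last slash
    printf '%s\n' "$name"
}

url_to_name "https://example.com/shows/episode-42.mp3?source=feed"
# prints: episode-42.mp3
```

Unlike the awk pipeline in the script, this version does not look for a file name after an "=" sign, so a URL that carries the name inside its query string would need extra handling.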

Lines 56 through 64 have additional error checking.  If any step fails, the script creates the bashpodder.FAIL file mentioned earlier, then exits to let a human fix what’s wrong.
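
One way to make the log update safer on a full disk is to build the merged list in a separate file and only replace the log once that file has been written. The sketch below shows that idea; it is an assumption about a reasonable approach, not the script's current behaviour, and the function name and the ".new" suffix are made up.

```shell
# Merge the URLs seen in this run into the permanent log.
# $1 = permanent log, $2 = file of URLs seen during this run.
update_log() {
    local log="$1"
    local seen="$2"
    local new="$1.new"
    touch "$log" || return 1
    # Build the merged list in a separate file; the old log stays intact.
    sort -u "$log" "$seen" > "$new" || { rm -f "$new"; return 1; }
    # Replace the log only after the merged file was written successfully.
    mv "$new" "$log"
}

# Demo in a throwaway directory with placeholder URLs.
dir=$(mktemp -d)
printf '%s\n' "https://example.com/a.mp3" > "$dir/podcast.log"
printf '%s\n' "https://example.com/b.mp3" "https://example.com/a.mp3" > "$dir/temp.log"
if update_log "$dir/podcast.log" "$dir/temp.log"; then
    cat "$dir/podcast.log"
else
    echo "FAILED to update podcast.log" >&2
fi
rm -rf "$dir"
```

If the sort fails part-way through, only the ".new" file is affected, so the previous podcast.log is still there for the next run.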

Line 69 is a hack, but it works for me.  Some of the feeds I have BashPodder monitor also publish background images (JPG files).  The script moves these files to my Backgrounds folder so I don’t have to move them myself.  (I’m lazy, so sue me!)

The parse_enclosure.xsl file I use is un-changed from the official BashPodder version.

I also do my listening from the command line, using VLC to play the video or audio file.  After listening to a night’s worth of 20-30 minute podcasts, I could have a number of files and directories to clean up.  I wrote my “Play And Delete” script to take care of that for me.

#!/bin/sh
# VLC Options:
OPTIONS="--zoom=2"

clear
if [ -z "`which vlc`" ] ; then
	echo Could not find vlc: `which vlc`
	exit 1;
fi

FILE=$1
EXT=`echo $FILE | rev | awk -F\. '{ print $1 }' | rev`
echo Playing: $FILE \($EXT\)

# Set the size of the new VLC we open.
#
# Note: only .mp3 files are resized; video files (e.g. .mp4) keep their size.
if [ "$EXT" = "mp3" ] ; then
  echo Resizing screen for $EXT extension.
  SIZING='0,0,1100,950,100'
  (sleep 2.0 ; wmctrl -i -r `wmctrl -l | grep VLC | awk '{ print $1 }'` -e $SIZING ) &
else
  echo "Not resizing an $EXT file."
fi

echo RUNNING: vlc $OPTIONS $FILE vlc://quit
vlc $OPTIONS "$FILE" vlc://quit 2> vlc.err
EC=$?
echo Exit code: $EC
if [ $EC -eq 0 ] ; then
    echo Deleting "$FILE"
    sleep 2
    rm "$FILE"
fi
rm -f `dirname $FILE`/*.m3u
rm -f `dirname $FILE`/.directory
rmdir --ignore-fail-on-non-empty `dirname $FILE`/../* 2>/dev/null

PAD basically takes a path/filename and tries to play the file with VLC.

Line 6 tries to confirm you have VLC installed and available in the path, otherwise it exits.

Line 12 gets the extension of the file (mp3, mp4, avi, etc.) so lines 19-21 can move and resize the VLC window to the lower-left corner of my screen.  I don’t resize video files, so if it isn’t an MP3 I don’t do anything.
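
The extension check can be sketched without the rev/awk pipeline by using parameter expansion. This is an alternative, not what the pad script does, and the function name and file names are made up. Unlike the pipeline, it returns an empty string when the file name has no dot.

```shell
# Print the extension of a file name, or an empty line if it has none.
file_ext() {
    local base="${1##*/}"          # strip any leading directories
    case "$base" in
        *.*) printf '%s\n' "${base##*.}" ;;
        *)   printf '\n' ;;
    esac
}

# Decide whether a file would get the audio-only window resize.
for f in "2010-05-01/show.mp3" "2010-05-01/show.mp4"; do
    if [ "$(file_ext "$f")" = "mp3" ]; then
        echo "$f: audio, would resize the window"
    else
        echo "$f: not resizing"
    fi
done
```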

Line 27 calls the VLC command to play the file.

Lines 28 through 34 check the exit code from VLC, and if it exited normally (i.e. got to the end of the podcast), the script deletes the podcast from the disk.  This automatic cleanup is great, especially for some of the larger video podcasts that can be 200+ MB in size.
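
The play-then-delete step can be tried out safely with a stand-in player. In this sketch the function name is hypothetical, and the first argument is whatever command should play the file; the demo passes "true" and "false" in place of VLC so nothing real gets played or lost.

```shell
# Run a player on a file and delete the file only if the player succeeded.
# $1 = player command, $2 = file to play.
play_and_delete() {
    local player="$1"
    local file="$2"
    if "$player" "$file"; then
        echo "Deleting $file"
        rm -- "$file"
    else
        echo "Player exited with an error; keeping $file" >&2
        return 1
    fi
}

# Demo: 'true' stands in for a player that finished normally,
# 'false' for one that failed.
played=$(mktemp)
failed=$(mktemp)
play_and_delete true "$played"
play_and_delete false "$failed" || echo "Still on disk: $failed"
rm -f "$failed"
```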

Lines 35 through 37 try to do some additional cleanup.  Since I don’t use an MP3 player, I don’t need the M3U files, and I also try to remove all of the empty directories.  (BashPodder saves the files into directories named for the year/month/day the download was performed.)
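
The same cleanup can be sketched with find. This assumes a find that supports -mindepth, -empty and -delete (GNU and BSD find do, but they are not in POSIX), and the function name is made up. The demo runs in a throwaway directory so it can't touch real podcasts.

```shell
# Remove playlists and any dated directories left empty under a root.
clean_podcast_dirs() {
    local root="$1"
    # Remove playlist files first so directories holding only a playlist
    # become empty.
    find "$root" -type f -name '*.m3u' -delete
    # Then remove directories with nothing left in them, but never the
    # root itself.
    find "$root" -mindepth 1 -type d -empty -delete
}

# Demo: one directory that has been fully played, one with a show left.
root=$(mktemp -d)
mkdir "$root/2010-05-01" "$root/2010-05-02"
touch "$root/2010-05-01/podcast.m3u"
touch "$root/2010-05-02/podcast.m3u" "$root/2010-05-02/show.mp3"
clean_podcast_dirs "$root"
ls "$root"
rm -rf "$root"
```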

My bp.conf has a lot of additional entries.  I won’t clutter up this page with it, but if you’re interested in what I’m pulling down you’re welcome to contact me for a copy.  (I’ll give you a hint – I’m a big TWiT.tv fan – Hi Leo, Tom, Iyaz, Sarah, and Steve!)

A big thanks to Linc for his work on the initial BashPodder script.  Once I had that framework I was able to add to it and tweak it to fit my needs – I hope it helps others too.