Tuesday, September 29, 2009

depending on svn externals? then piston is for you

Sometimes I worry: every time I do an svn up in a repo that has externals defined, I see code changing almost daily. Should I be regression testing every time that happens? What if the projects my project depends on break something? Well, you can put a proxy between those external repos and your repo using piston. It will be the man in the middle between your repo and theirs (well, the code is only mixed in your working copy, but who's counting). I haven't tried it yet, but it looks like you might be able to "freeze" an external, or deploy a new checkout from your repo without being connected to the remote repo (as long as piston already has that remote repo cached).

http://piston.rubyforge.org/

there is even a post from this guy:

http://jfcouture.com/2007/12/12/the-guerilla-guide-to-git-how-to-start-using-git-even-if-your-team-is-on-svn/

who is using it to allow for svn externals in his git repo! so maybe you can intermix svn externals and git submodules?

attempting to fix svn (subversion) repo corruption using fsfsverify

If you suspect corruption in your repo

start by running the svnadmin verify command on your repo

svnadmin verify /Volumes/DATA/SubVersion/projectname

it will run through each and every commit starting at 0 through HEAD

* Verified revision 716.
* Verified revision 717.
* Verified revision 718.
* Verified revision 719.
* Verified revision 720.
* Verified revision 721.
svnadmin: Decompression of svndiff data failed

and possibly get cut off at a certain commit with an error message (which may or may not look like the one above)
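
To pull the suspect revision number out of that output automatically, something like the following might work. It is only a sketch: first_suspect_revision is a made-up helper name (not part of Subversion), and it assumes the "* Verified revision N." line format shown above.

```shell
# Hypothetical helper (not part of Subversion): reads `svnadmin verify`
# output on stdin and prints the first revision that did NOT verify,
# i.e. the last verified revision + 1 (or 0 if nothing verified at all).
first_suspect_revision() {
    awk '/^\* Verified revision [0-9]+\.$/ { rev = $4; sub(/\.$/, "", rev); last = rev; seen = 1 }
         END { print (seen ? last + 1 : 0) }'
}

# Usage (the progress lines may be written to stderr, hence the 2>&1):
#   svnadmin verify /Volumes/DATA/SubVersion/projectname 2>&1 | first_suspect_revision
```

Note that if the whole repo verifies cleanly this simply prints HEAD + 1, so only trust the number when svnadmin actually reported an error.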

If and when this happens the best thing to do is to immediately restore your latest backup.

If the latest backup is old and out of date, you have a few options. The best option is to restore that backup and then decide whether you need a record of the commits made between the time the backup was created and the current HEAD. If you don't need any of that interim commit history, you can just restore the backup and then commit your current working copy (svn ci) to bring everything back up to date.

If you do need some of that history you should start by making a backup of the corrupted repository.
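
A minimal sketch of that backup step, assuming a plain file copy is good enough (nobody should be committing while it runs). The function name, the "_corrupt_" suffix and the destination layout are my own illustrative choices.

```shell
# Copy a repository directory to a timestamped folder under DEST_ROOT and
# print the path of the copy. All names here are illustrative assumptions.
backup_repo_dir() {
    local repo="${1%/}" dest_root="$2" stamp dest
    stamp=$(date +%Y%m%d-%H%M%S)
    dest="$dest_root/$(basename "$repo")_corrupt_$stamp"
    # -R copies the whole tree, -p keeps timestamps and permissions
    cp -Rp "$repo" "$dest" && echo "$dest"
}

# Usage:
#   backup_repo_dir /Volumes/DATA/SubVersion/projectname /path/to/backups
```

A plain copy is used here so that nothing has to parse the damaged revision files while the backup is being made.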

Then try and repair the corruption (don't get your hopes up. sometimes it is possible. most times it is not)

The tool we will use to try and repair the repo is called fsfsverify and it is used in particular for readlength errors which subversion is surprisingly susceptible to.
read: http://www.szakmeister.net/fsfsverify/
grab: http://www.szakmeister.net/fsfsverify.tar.gz

The last revision that verified in the svnadmin verify output above was 721, so we expect our corruption to be in or around revision 722. We start by pointing fsfsverify at the rev file for that revision:

./fsfsverify/fsfsverify.py -f /Volumes/DATA/SubVersion/projectname/db/revs/722

In the above we run fsfsverify.py with the -f (fix) option against revision 722, where we expect the corruption. If you don't know which commit is the corrupted one, you may want to run it against every revision up to the latest (HEAD). NOTE: USING FSFSVERIFY WITH -f MAY CAUSE MORE CORRUPTION. You can scan for corruption without changing anything by leaving off the -f switch. BE SURE TO BACK UP YOUR REPO, EVEN YOUR CORRUPTED REPO, BEFORE USING THE -f SWITCH.
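
If you want to do that read-only scan over a range of revisions, a loop along these lines might help. This is a sketch under assumptions: rev_file and scan_revs are names I made up, the FSFSVERIFY variable is just a convenience, and I am assuming fsfsverify.py exits non-zero when it hits a problem (an uncaught Python traceback does). It looks for rev files either directly in db/revs/ or one shard directory down (db/revs/0/722), and it does not handle packed revisions.

```shell
# Path to the fsfsverify script; override if yours lives elsewhere.
FSFSVERIFY=${FSFSVERIFY:-./fsfsverify/fsfsverify.py}

# Print the rev file for revision $2 of repository $1 (flat or sharded layout).
rev_file() {
    local repo="$1" rev="$2"
    if [ -f "$repo/db/revs/$rev" ]; then
        echo "$repo/db/revs/$rev"
    else
        find "$repo/db/revs" -mindepth 2 -maxdepth 2 -type f -name "$rev" | head -n 1
    fi
}

# Scan revisions FROM..TO WITHOUT the -f switch and print the ones that fail.
scan_revs() {
    local repo="$1" from="$2" to="$3" rev file
    for rev in $(seq "$from" "$to"); do
        file=$(rev_file "$repo" "$rev")
        [ -n "$file" ] || { echo "r$rev: rev file not found"; continue; }
        $FSFSVERIFY "$file" > /dev/null 2>&1 || echo "r$rev: fsfsverify reported a problem"
    done
}

# Usage:
#   scan_revs /Volumes/DATA/SubVersion/projectname 700 730
```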

here are some errors you might see if your repo IS corrupted

Traceback (most recent call last):
File "./fsfsverify/fsfsverify.py", line 1120, in
for noderev in strategy:
File "./fsfsverify/fsfsverify.py", line 839, in _nodeWalker
for x in self._nodeWalker():
File "./fsfsverify/fsfsverify.py", line 839, in _nodeWalker
for x in self._nodeWalker():
File "./fsfsverify/fsfsverify.py", line 832, in _nodeWalker
noderev = NodeRev(self.f, self.currentRev)
File "./fsfsverify/fsfsverify.py", line 723, in __init__
self.dir = getDirHash(f)
File "./fsfsverify/fsfsverify.py", line 492, in getDirHash
raise ValueError, "Expected a PLAIN representation (%d)" % f.tell()
ValueError: Expected a PLAIN representation (14899)

If it thinks it fixed things ("thinks" being the keyword; it might not make things better at all), it might show:

NodeRev Id: 4bn.0.r723/45401
type: file
text: DELTA 723 3991 907 2209 33d818571849f2eb34a7d872be1a5639
cpath: /lib/filter/doctrine/base/BasePageTemplateMapFormFilter.class.php
copyroot: 0 /


NodeRev Id: 4bm.0.r723/45587
type: file
text: DELTA 723 17170 813 1909 faefab79ab1c9b61b8c7ae9297b97127
cpath: /lib/filter/doctrine/base/BaseModulePageFormFilter.class.php
copyroot: 0 /

Copy 7 bytes from offset 17744
Write 7 bytes at offset 17190
Fixed? :-) Re-run fsfsverify without the -f option

it is possible that it fixed your issue but it is likely that it did not. you can check this by running the svnadmin verify command again:

svnadmin verify /Volumes/DATA/SubVersion/projectname

again it will run through each and every commit starting at 0 through HEAD

* Verified revision 716.
* Verified revision 717.
* Verified revision 718.
* Verified revision 719.
* Verified revision 720.
* Verified revision 721.
svnadmin: Decompression of svndiff data failed

Again we see the same issue, so it's time to give up on keeping that revision. It's corrupted and we don't have a backup of it, so we cut our losses. CUT being the keyword, as we are literally going to slice out the bad revision: we take svnadmin dumps of all the commits around it and then merge them back together without the corrupted revision in them. Then we load the merged dump of all the good commits into a NEW repo and start over (only after backing up the new repo, of course. You should have been doing more backups and never had to deal with this in the first place!). Read my previous few blog posts for how to use svnadmin dump to slice the corrupted repo into pieces and then bring them back together without the corruption.

dealing with subversion corruption: dumping good segments of the repo while slicing out corrupted portions

svnadmin dump /path/to/repo/projectname/ -r 0:721 > projectname_r0to721.dump

In the above example the corrupted commits are 722 and 723, so we dump everything before 722 (revisions 0 through 721) into one file and then:

svnadmin dump /path/to/repo/projectname/ --incremental -r 724:724 > projectname_r724.dump

That is revision 724 on its own, with the --incremental switch so the dump only contains the changes made in that revision rather than the full tree.

And then everything from 726 to HEAD (725 was also corrupted), again with the --incremental switch:

svnadmin dump /path/to/repo/projectname/ --incremental -r 726:HEAD > projectname_r726.dump
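
If you have more than a couple of bad revisions, working out the good ranges by hand gets tedious. Here is a small sketch that only PRINTS the dump commands for you to review and run yourself. dump_commands is a hypothetical helper, the output file naming is my own, and you pass the HEAD revision as a number.

```shell
# Print one "svnadmin dump" command per run of good revisions.
# Usage: dump_commands REPO NAME HEAD_REV BAD_REV [BAD_REV...]
dump_commands() {
    local repo="$1" name="$2" head="$3"
    shift 3
    local bad=" $* " start="" last="" flag="" rev
    # "end" is a sentinel so the final good range gets closed too
    for rev in $(seq 0 "$head") end; do
        if [ "$rev" != "end" ] && [[ "$bad" != *" $rev "* ]]; then
            [ -n "$start" ] || start="$rev"   # open a new good range
            last="$rev"
        elif [ -n "$start" ]; then
            # close the current good range and print its dump command
            echo "svnadmin dump $repo$flag -r $start:$last > ${name}_r${start}to${last}.dump"
            start=""
            flag=" --incremental"             # every dump after the first
        fi
    done
}

# Usage:
#   dump_commands /path/to/repo/projectname/ projectname 730 722 723 725
```

Only the first dump is a full dump; every later segment gets --incremental so it can be stacked on top of the earlier ones.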

dealing with subversion corruption: merge two svndump (subversion dump) flat files using svndumptool

grab:http://svn.borg.ch/svndumptool/0.5.0/svndumptool-0.5.0.tar.gz
read:http://svn.borg.ch/svndumptool/

start by checking the dumps you made (with svnadmin) for validity (make sure you didn't include the corrupted commits in your dump)

$ python ./svndumptool-0.5.0/svndumptool.py check -A projectnamefull.dump
Checking file projectnamefull.dump

Traceback (most recent call last):
File "./svndumptool-0.5.0/svndumptool.py", line 116, in
sys.exit( func( appname, args ) )
File "/Volumes/DATA/Staff/jessesanford/svndumptool-0.5.0/svndump/tools.py", line 523, in svndump_check_cmdline
if check.execute( filename ) != 0:
File "/Volumes/DATA/Staff/jessesanford/svndumptool-0.5.0/svndump/tools.py", line 241, in execute
while dump.read_next_rev():
File "/Volumes/DATA/Staff/jessesanford/svndumptool-0.5.0/svndump/file.py", line 474, in read_next_rev
self.__skip_empty_line()
File "/Volumes/DATA/Staff/jessesanford/svndumptool-0.5.0/svndump/file.py", line 132, in __skip_empty_line
raise SvnDumpException, "expected empty line, found '%s'" % line
svndump.common.SvnDumpException: expected empty line, found ''



The above dump DID include corruption, as you can see from the python script bombing out. What you should see is:

$ python ./svndumptool-0.5.0/svndumptool.py check -A projectname_r0to721.dump

Checking file projectname_r0to721.dump
OK

Once you are satisfied that all your dump files are clean (corruption free) and ready to be merged back into one, run:

$ python ./svndumptool-0.5.0/svndumptool.py merge -iprojectname_r0to721.dump -iprojectname_r724.dump -iprojectname_r726.dump -oprojectname_merged.dump

You can see that this merges dump files for a few different segments of revisions. In the above example revisions 722, 723 and 725 were corrupted, so I had to slice the repo into 3 sections and then merge them all back together to minimize the loss of version history.
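
The check and merge steps can be chained so the merge never runs on a bad dump. A rough sketch, assuming svndumptool's check command exits non-zero on a bad file (the traceback above suggests it does). check_and_merge and the SVNDUMPTOOL variable are my own names, and file names with spaces are not handled.

```shell
# Command used to invoke svndumptool; override if yours lives elsewhere.
SVNDUMPTOOL=${SVNDUMPTOOL:-"python ./svndumptool-0.5.0/svndumptool.py"}

# Usage: check_and_merge OUTPUT.dump INPUT1.dump [INPUT2.dump ...]
check_and_merge() {
    local out="$1" f args=""
    shift
    for f in "$@"; do
        # refuse to merge if any single input fails the check
        if ! $SVNDUMPTOOL check -A "$f" > /dev/null 2>&1; then
            echo "check failed for $f, not merging" >&2
            return 1
        fi
        args="$args -i$f"
    done
    $SVNDUMPTOOL merge $args -o"$out"
}

# Usage:
#   check_and_merge projectname_merged.dump projectname_r0to721.dump \
#       projectname_r724.dump projectname_r726.dump
```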

load data from svndump flatfile into new repo

svnadmin load /Volumes/DATA/SubVersion/newprojectname < projectname_merged.dump
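
Keep in mind svnadmin load needs an existing repository to load into, so the new repo has to be created first. A small sketch of the whole sequence (rebuild_repo and the SVNADMIN variable are just names I picked):

```shell
# svnadmin binary to use; override for testing or non-standard installs.
SVNADMIN=${SVNADMIN:-svnadmin}

# Create NEW_REPO, load DUMP_FILE into it, then verify the result.
# Stops at the first step that fails.
rebuild_repo() {
    local new_repo="$1" dump_file="$2"
    $SVNADMIN create "$new_repo" &&
    $SVNADMIN load "$new_repo" < "$dump_file" &&
    $SVNADMIN verify "$new_repo"
}

# Usage:
#   rebuild_repo /Volumes/DATA/SubVersion/newprojectname projectname_merged.dump
```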

check your svn (subversion) repo for corruption

svnadmin verify /Volumes/DATA/SubVersion/projectname

modify subversion commit message for particular commit

read: http://subversion.tigris.org/faq.html#change-log-msg

there are a few ways to do this. the following requires you have shell access to the server with the repo on it.

svnadmin setlog /Volumes/DATA/SubVersion/projectname/ -r 2297 ./tmppropmessage.txt --bypass-hooks

You have to do this on the machine that hosts the repository, with the repo available via a normal filesystem path. You also have to pass the message in as a text file. Strange that you can't pass a -m "message" param.

$ cat tmppropmessage.txt
refactored the blah blah blah commit message goes here.

NOTE: the --bypass-hooks option should be used with care. There are sometimes things in the pre-revprop-change hook script that are important (like emailing an administrator to let them know that you changed a property).

If you don't pass the --bypass-hooks option you will have to deal with whatever logic is in the repo's pre-revprop-change hook. A new repo only ships with the template at /path/to/repo/projectname/hooks/pre-revprop-change.tmpl; the hook is only active once it has been copied to hooks/pre-revprop-change (without the .tmpl) and made executable.

Here is an example of that hook (note I trimmed out all the comments):

$ cat ./hooks/pre-revprop-change.tmpl

REPOS="$1"
REV="$2"
USER="$3"
PROPNAME="$4"
ACTION="$5"

if [ "$ACTION" = "M" -a "$PROPNAME" = "svn:log" ]; then exit 0; fi

echo "Changing revision properties other than svn:log is prohibited" >&2
exit 1

As you can see, the above logic only rejects attempts to change properties OTHER than the log message. So, as long as this hook is installed as hooks/pre-revprop-change, changing our log message would have worked without the --bypass-hooks switch in this instance.
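
If your repo has no active hook yet, something like this would install one with the same logic as the template above. It is a sketch: enable_log_edits is a made-up name, and it overwrites any existing hooks/pre-revprop-change, so check first.

```shell
# Write an executable pre-revprop-change hook into REPO/hooks that allows
# modifying svn:log and rejects every other revision property change.
enable_log_edits() {
    local hook="${1%/}/hooks/pre-revprop-change"
    cat > "$hook" <<'EOF'
#!/bin/sh
# Arguments passed in by Subversion: REPOS REV USER PROPNAME ACTION
PROPNAME="$4"
ACTION="$5"
if [ "$ACTION" = "M" ] && [ "$PROPNAME" = "svn:log" ]; then exit 0; fi
echo "Changing revision properties other than svn:log is prohibited" >&2
exit 1
EOF
    chmod +x "$hook"
}

# Usage:
#   enable_log_edits /Volumes/DATA/SubVersion/projectname
```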

dumping svn (subversion) repositories to flat file for backup.

svnadmin dump /Volumes/DATA/SubVersion/projectname/ -r 0:HEAD > ~/projectname_svn_9-10-2009_a.dump
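
Dump files get big, so you may want to compress them and let the script work out the date stamp. A sketch only: backup_dump, the SVNADMIN variable and the file naming pattern are my own assumptions.

```shell
# svnadmin binary to use; override for testing or non-standard installs.
SVNADMIN=${SVNADMIN:-svnadmin}

# Dump REPO to DEST_DIR/<name>_svn_<YYYY-MM-DD>.dump.gz and print that path.
# The body runs in a subshell so pipefail does not leak into the caller.
backup_dump() (
    set -o pipefail   # fail if svnadmin fails, not only if gzip fails
    repo="${1%/}"
    out="$2/$(basename "$repo")_svn_$(date +%F).dump.gz"
    $SVNADMIN dump "$repo" -r 0:HEAD | gzip > "$out" && echo "$out"
)

# Usage:
#   backup_dump /Volumes/DATA/SubVersion/projectname ~/backups
# Restore later with:
#   gzip -dc projectname_svn_2009-09-10.dump.gz | svnadmin load /path/to/newrepo
```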

Wednesday, September 23, 2009

bash script to push (if trigger is in place) an application directory and the corresponding app webroot and the app db with excludes lists for all 3

This script uses rsync and the mysql maatkit toolset: http://www.maatkit.org/doc/mk-table-sync.html

Note: the triggers in the script below are sort of backwards. Really you should write the trigger to the file system via some web accessible script. Then, if the script below is cronned, it will notice the trigger is there, sync everything, and delete the trigger when it's done. After that the script waits for someone to "pull the trigger" again by writing out another trigger file, which causes the next sync.
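
The "proper" trigger flow described in that note could be sketched roughly like this. process_triggers and the handler argument are hypothetical names of mine; the idea is just to run something for each trigger file and only delete the trigger once that something succeeded.

```shell
# For every trigger file in DIR, call HANDLER with the trigger's name and
# remove the trigger file if the handler succeeded.
process_triggers() {
    local dir="$1" handler="$2" f
    for f in "$dir"/*; do
        [ -e "$f" ] || continue          # empty directory: the glob matched nothing
        if "$handler" "$(basename "$f")"; then
            rm -f "$f"                   # only clear the trigger after success
        fi
    done
}

# Usage (do_sync would be your own function taking "htdocs", "web" or "db"):
#   process_triggers /usr/local/apache2/htdocs/web/trigger do_sync
```

Leaving a failed trigger in place means the next cron run simply retries it.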

[root@usweekly-qa-app jsanford]# cat /usr/local/bin/pushWebroot.sh
#!/bin/bash
## This script is using key-based authentication ##
##################################
##### User-editable Variables ####
##################################
source_path=/usr/local/apache2/htdocs
target_path=/usr/local/apache2/htdocs
webexcludes=/usr/local/apache2/htdocs/config/rsync_web_excludes.txt
htdocsexcludes=/usr/local/apache2/htdocs/config/rsync_htdocs_excludes.txt
#excludes=''
user=rsync_username
privatekey=/path/to/private/key/for/above/rsync/username_dsa
testing='0'
servers='
ip.for.first.slave.app.server
ip.for.second.slave.app.server'
dbmaster='ip.for.db.master'
dbslave='ip.for.db.slave'
database='databasename'
dbmasteruser='databaseuser'
dbmasterpass='databasepassword'
dbslaveuser='slavedbuser'
dbslavepass='slavedbpassword'
triggerfiles=`ls $source_path/web/trigger/`
dbexcludes='comma,delimited,list,of,table,names,to,exclude,make,sure,to,preface,them,with,the,database,name'
################################################
##### Do not edit anything below this line! ####
################################################

################################################
######### Determining Sync Settings ############
################################################

if [ -z "$triggerfiles" ]; then
## If no trigger file exists, script exits ##
echo "There is no trigger file in $source_path/web/trigger/"
echo "No content will be published"
exit
else
source=$source_path/
target=$target_path/
websync='0'
htdocssync='0'
dbsync='0'
for triggerfile in $triggerfiles; do
if [ "$triggerfile" == "htdocs" ]; then
echo "We're syncing all content under htdocs"
htdocssync='1'
elif [ "$triggerfile" == "web" ]; then
echo "We're syncing htdocs/web root only"
websync='1'
websource=$source_path/web/
webtarget=$target_path/web/
elif [ "$triggerfile" == "db" ]; then
echo "We're syncing the database"
dbsync='1'
else
echo "nothing was specified or an error occurred"
htdocssync='0'
websync='0'
dbsync='0'
exit
fi
done
echo "sources: $source"
echo "targets: $target"
echo "websources: $websource"
echo "webtargets: $webtarget"
echo "Databases Sync=$dbsync"
echo "Web Sync=$websync"
echo "Htdocs Sync=$htdocssync"
##################################
### Determining Testing Mode #####
##################################

if [ "$testing" = "0" ]; then
dryrun=""
echo "#######################################"
echo "### We are NOT running in test mode ###"
echo "### Content will be replicated ########"
echo "#######################################"
else
dryrun="--dry-run"
echo "##########################################"
echo "### We are running in test mode ##########"
echo "### no content will be replicated ########"
echo "##########################################"
fi
echo $dryrun

##################################
### Defining Sync Sources ########
##################################
if [ "$htdocssync" = "1" ]; then
for server in $servers; do
echo "Starting content push to $server at `date`"
#echo "$server"
## See if servers are there and accepting connections ##
#echo "Hello $server"
#ssh root@$server hostname
echo "`date`"
echo "synchronizing $source to $server:$target"
/usr/bin/rsync -avzC --force --delete --progress --stats $dryrun --exclude-from=$htdocsexcludes -e "ssh -ax -i $privatekey" $source $user@$server:$target
done
else
echo "No static htdocs Content will be synced"
fi

##################################
### Defining web Sync Sources ##
##################################
if [ "$websync" = "1" ]; then
for server in $servers; do
echo "Starting content push to $server at `date`"
#echo "$server"
## See if servers are there and accepting connections ##
#echo "Hello $server"
#ssh root@$server hostname
echo "`date`"
echo "synchronizing $websource to $server:$webtarget"
/usr/bin/rsync -avzC --force --delete --progress --stats $dryrun --exclude-from=$webexcludes -e "ssh -ax -i $privatekey" $websource $user@$server:$webtarget
done
else
echo "No static web Content will be synced"
fi


##################################
### Database Sync ################
##################################
if [ "$dbsync" = "1" ]; then
echo "Replicating mysql database from $dbmaster to $dbslave"
####################################################
### We are using mk-table-sync instead of SQL Yog ##
### Uncomment one of the two lines below only ######
####################################################
mk-table-sync --execute $dryrun --verbose --ignore-tables $dbexcludes --databases $database u=$dbmasteruser,p=$dbmasterpass,h=$dbmaster u=$dbslaveuser,p=$dbslavepass,h=$dbslave
#/root/sqlyog/sja /root/sqlyog/usmagazine_prod.xml
else
echo "No database will be synced"
fi

echo "Finishing content push at `date`"
## echo "Re-setting sync cycle:"
## echo "touch $source_path/web/trigger/$triggerfile"
## touch $source_path/web/trigger/$triggerfile

fi

exit 0

use cron and bash script rsync to synchronize two webroots (or any folder for that matter)

[root@cms ]# cat scripts/sync_cms_to_www1.sh
#!/bin/bash

echo syncing everything in webroot to www1 webroot
rsync -avzC --force --progress -e "ssh -i /keys/cron_dsa" --exclude-from=/usr/local/bin/scripts/rsync_excludes.txt /www/ username@slave_server_ip:/www/
echo done!
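
When you cron a sync like this, a slow rsync can still be running when the next one starts. One way to guard against that is a simple lock directory, sketched below. run_locked, the lock path and the every-10-minutes schedule are my own assumptions, and a crashed run can leave a stale lock behind that you would have to remove by hand.

```shell
# Run a command only if no other run holds the lock directory.
# mkdir is atomic, so two overlapping cron runs cannot both get the lock.
run_locked() {
    local lock="$1"
    shift
    if ! mkdir "$lock" 2>/dev/null; then
        echo "previous run still in progress, skipping" >&2
        return 0
    fi
    "$@"
    local status=$?
    rmdir "$lock"
    return $status
}

# Usage inside a wrapper script:
#   run_locked /tmp/sync_cms_to_www1.lock /usr/local/bin/scripts/sync_cms_to_www1.sh
#
# Example crontab line for that wrapper (hypothetical path and schedule):
#   */10 * * * * /usr/local/bin/scripts/sync_wrapper.sh >> /var/log/sync_cms_to_www1.log 2>&1
```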

use sqlyog job agent to synchronize two mysql databases

cron an sja job to push changes from a master database to a slave database via a bash script:

Here is a link to the download page for the sqlyog job agent (SJA): http://www.webyog.com/en/downloads.php#sqlyog

Here is the xml file for the master to slave push job: (note the xml below has html encoded < and > characters so you might not be able to just copy and paste it)

[root@cms ]# cat scripts/sync_cms_db_to_www1_db.xml

<job version="6.5">
<syncjob>
<abortonerror abort="no" />
<fkcheck check="no" />
<twowaysync twoway="no" />
<source>
<host>localhost</host>
<user>username</user>
<pwd>password</pwd>
<port>3306</port>
<ssl>0</ssl>
<sslauth>0</sslauth>
<clientkey />
<clientcert />
<cacert />
<cipher />
<charset />
<database>databasename</database>
</source>
<target>
<host>slave_server_ip</host>
<user>username</user>
<pwd>password</pwd>
<port>3306</port>
<ssl>0</ssl>
<sslauth>0</sslauth>
<clientkey />
<clientcert />
<cacert />
<cipher />
<charset />
<database>databasename</database>
</target>
<tables all="yes" />
</syncjob>
</job>


Here is the bash script that runs the above job xml:

[root@cms ]# cat scripts/sync_cms_dbs_to_www1_dbs.sh
#!/bin/bash

echo Syncing cms dbs to www1 dbs...

/usr/local/bin/scripts/sja "/usr/local/bin/scripts/sync_cms_db_to_www1_db.xml" -l"/var/log/databasename_db_cms_to_www1_sync_log.txt" -s"/var/log/databasename_db_cms_to_www1_sync_session.xml"

echo Done!

bash script to tar gzip backup apache webroot (or any folder) with timestamp

[root@cms ]# vi scripts/backup_webroot.sh

#!/bin/bash

echo tarring webroot
stamp=$(date --utc --date "$1" +%F)
tar -czf /backups/sitename_webroot_$stamp.tgz /path_to_webroot/*
echo done

backup script for mysql

simple mysqldump to file with date stamp


[root@cms]# vi scripts/backup_db.sh

#!/bin/bash
echo "dumping db"

stamp=$(date -u --date "$1" +%F)
mysqldump -u username --password=password databasename > /backups/databasename_db_backup_$stamp.sql

Monday, September 21, 2009

bash script to install yum on centos

I created this script to download all the required rpm packages and install them automatically. You will need to check the repo for the most up to date package names.

[root@cms ~]# vi yum-rpm-install.sh

for file in \
gmp-4.1.4-10.el5.i386.rpm \
readline-5.1-1.1.i386.rpm \
python-2.4.3-19.el5.i386.rpm \
libxml2-2.6.26-2.1.2.i386.rpm \
libxml2-python-2.6.26-2.1.2.i386.rpm \
expat-1.95.8-8.2.1.i386.rpm \
python-elementtree-1.2.6-5.i386.rpm \
sqlite-3.3.6-2.i386.rpm \
python-sqlite-1.1.7-1.2.1.i386.rpm \
elfutils-0.125-3.el5.i386.rpm \
rpm-python-4.4.2-47.el5.i386.rpm \
m2crypto-0.16-6.el5.1.i386.rpm \
python-urlgrabber-3.1.0-2.noarch.rpm \
yum-metadata-parser-1.0-8.fc6.i386.rpm \
yum-3.0.5-1.el5.centos.5.noarch.rpm
do rpm -Uvh http://mirror.centos.org/centos-5/5.1/os/i386/CentOS/$file;
done

Hack cron bash script to keep specific URLS in the cache

Script to "touch" urls so that they stay in the cache of a caching proxy. The urls are listed one per line in a file called URLs.conf

[root@cms ~]# vi /root/touch_urls-www1.sh

#!/bin/bash

for i in `cat URLs.conf`
do curl -H "Host: www.refinery29.com" http://127.0.0.1$i -s >> /dev/null
done


Here is the URLs.conf:

[root@cms ~]# vi /root/URLs.conf

/index.php
/about.php
/contact.php

Thursday, September 10, 2009

One for Anthony: how to delete from the svn server while maintaining your local copy

In case you checked something in that you shouldn't have.

svn delete /path/to/file/name/in/your/working/copy --keep-local
svn ci /path/to/file/name/in/your/working/copy -m "removing files i shouldn't have checked in cause I'm a dummy"

Wednesday, September 09, 2009

Change subversion log messages without pre-revprop-change hook script installed

on the machine local to the repository execute the following:

svnadmin setlog /Volumes/DATA/SubVersion/reponame/ -r 2297 ./textfilecontainingmessage.txt --bypass-hooks

Here 2297 is the revision number, and --bypass-hooks is the switch that will allow you to get past this message:

svnadmin: Repository has not been enabled to accept revision propchanges;