Thursday, August 19, 2010

CMS: Concrete5 CEO, Franz Maruna, Stops By For A Visit

In our last post, CMS: Joomla vs Concrete5 -- First Look, we reviewed an up-and-coming Web CMS called Concrete5 and compared it to the well-established giant, Joomla.

We were very honored to have the CEO and Founder of Concrete5 stop by to read the review and join in on the discussion going on in the comments section. We also had the pleasure of the President and CEO of Trivera starting off the discussion and sharing his own experiences with Concrete5.

It's great to see that we not only have industry professionals giving us their opinions, but also those who are closest to the tool chiming in. Let's try to keep up this open communication as we do some more in-depth reviews in coming posts!

Tuesday, August 17, 2010

CMS: Joomla vs Concrete5 -- First Look

I've been delving into the fun world of web design lately and have been trying my hand at using a Content Management System to make it easier. It turns out that using a CMS makes the website simpler in some ways, but adds its own complexities. As I work across the different CMSes, I think it is worth sharing some top-level opinions for those who are trying to decide between them.

All of these systems are open source and typically freely available on most web hosting services. If not, they are easy enough to install. I am not going to look at that right now. Perhaps in later posts we'll drill down further into the differences, but for now, I am just looking at the admin interface.

I started with Joomla, since it is best known, discarded it, tried Drupal, discarded it, played with WordPress, liked it but discarded it, and finally hit upon the lesser-known Concrete5. For a while it seemed I had hit upon a gem: I made my site, but as I tried to add a few features, I found myself limited. I then worked my way backwards through Drupal and eventually found myself with Joomla again. Each has its strengths and weaknesses...

Concrete5 is phenomenal in terms of interface for contributors. Setting up the page is rather simple: take a template you like, insert just a few PHP statements, then upload it to the Concrete5 backend. From there, using the front end, you can just click around on the page itself (when in edit mode) and do WYSIWYG-style edits. You can add blocks of text, HTML, images, Flash content, etc. Once added, if you want to change the layout, you can just grab the blocks and rearrange them. There is version control as well! Each edit creates a new version of that page, and you can roll back to earlier versions.

Joomla, on the other hand, is much harder to set up. The backend is not as simple or user friendly/intuitive. Editing content is not as visual and requires work in the admin console. Once you get used to it, Joomla can become easy, but adding/modifying content is nowhere near as easy as it is in Concrete5.

So why did I switch back to Joomla? Concrete5 fails when it comes to extensibility and community support. As I wanted to add advanced features to my site, like eCommerce shopping carts, I found that Concrete5 couldn't compete with the other CMSes available. They did offer modules to extend Concrete5, but at a cost. Now, I pursued an Open Source CMS enticed by the no-cost solution, so hearing that I have to pay to extend it turned me off. Joomla, Drupal, and WordPress blow Concrete5 out of the water in this arena. Not to say they are all free...all of them have paid and free extensions/modules/plugins. The difference is there is usually a free version of everything and paid versions for more advanced/professional features.

Drupal is a little weak compared to Joomla in terms of eCommerce, so I glossed over it in my search and moved back to Joomla. WordPress was a very good solution too, but I discarded it since it really is designed for blogging sites. There are ways to use WordPress for a non-blog site, but I didn't want to force a round peg into a square hole. (Side note: this blog is hosted on Blogger.com, not a WordPress site.) Joomla finally fit the bill here.

I'll try to talk in more detail about the features later, but since this is a blog, I'll have to curb myself here from going on and on. My quick summary:

1. Concrete5 is great for simple sites with content contributors that are not too technically savvy.

2. Joomla is great for a lot of different tasks, very flexible, very powerful, but pays for this all with added admin complexity.

Wednesday, August 11, 2010

Learn with Lynda.com Video Tutorials

I don't share links often as this blog is really just for me to post tips & tricks I learn on the job, but this link is amazing. A coworker told me about www.lynda.com when we were discussing learning how to use video editing software. This website has excellent video training on almost anything you can think of. Basically any of the major multimedia or office software has extensive training videos available: Adobe Photoshop, Premiere, After Effects, Microsoft Excel, Access, etc. They even have programming languages like PHP, Python, etc. I signed up for the 7 day free trial (Google it and you will find a link) and after just 20 minutes of watching videos I was blown away at the possibilities.

The link I found for the 7 day free trial is http://www.lynda.com/promo/trial/Default.aspx?lpk35=197. That is at the time of this post, but no guarantees that it will still be valid when you try it. If not, just Google it: "7 day free trial lynda.com"

Tuesday, August 10, 2010

LINUX: Sub-shells

I mentioned sub-shells before and the way they affect aliases, but I did not talk much about the use of sub-shells themselves. A sub-shell can be quite useful in scripting, either to isolate certain environments, or to run sets of commands in parallel.

Using them is simple: encapsulate the commands you want run in a sub-shell within parentheses.
#!/bin/bash

( echo "Start sub-shell"
export ENVVAR="This is my sub-shell variable" )

echo $ENVVAR
The first thing you will notice when running this script is that the environment variable we set within the sub-shell does not propagate back to the script. The echo of $ENVVAR will be blank. Sub-shells are unidirectional; they use one-way passing of the environment, so a sub-shell can inherit your current environment but will not give anything back. This is very useful if you need to do multiple things that all require separate environments.
#!/bin/bash

( echo "Start sub-shell 1"
echo $ENVVAR
export ENVVAR="1"
echo $ENVVAR )

( echo "Start sub-shell 2"
echo $ENVVAR
export ENVVAR="2"
echo $ENVVAR )

( echo "Start sub-shell 3"
echo $ENVVAR
export ENVVAR="3"
echo $ENVVAR )
From the output, you will see that the ENVVAR set in one sub-shell does not carry over to the other. Each environment is isolated. This technique can be used if you are doing multiple code builds in parallel, each with their own development environment, variables, compilers, etc. As it stands, the above executes each sub-shell serially, running one then the next. How about running in parallel? Add an ampersand "&" at the end of each sub-shell to indicate backgrounding the whole block.
#!/bin/bash

( echo "Start sub-shell 1"
echo $ENVVAR
sleep 2
export ENVVAR="1"
echo $ENVVAR ) &

( echo "Start sub-shell 2"
echo $ENVVAR
sleep 2
export ENVVAR="2"
echo $ENVVAR ) &

( echo "Start sub-shell 3"
echo $ENVVAR
sleep 2
export ENVVAR="3"
echo $ENVVAR ) &
I put sleep 2 in each block to slow it down so you can see the effects of backgrounding. What you will see is that it executes all the echo commands to print "Start..." at the same time, waits 2 seconds, and then the numbers get printed. The three sub-shells are running in parallel! Did you catch the problem in the above, though? You were returned to the prompt prior to the script finishing. You received your prompt, and then all of a sudden the numbers got printed out afterwards. In order to tell the script to wait until all the sub-shells are finished before exiting, simply type "wait" at the end of the script. The final script will look like:
#!/bin/bash

( echo "Start sub-shell 1"
echo $ENVVAR
sleep 2
export ENVVAR="1"
echo $ENVVAR ) &

( echo "Start sub-shell 2"
echo $ENVVAR
sleep 2
export ENVVAR="2"
echo $ENVVAR ) &

( echo "Start sub-shell 3"
echo $ENVVAR
sleep 2
export ENVVAR="3"
echo $ENVVAR ) &

wait
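
If you also need to know whether each background sub-shell succeeded, one variation (a sketch, assuming bash; the build commands are made up for illustration) is to capture each sub-shell's process ID with $! and wait on each one individually, since "wait <pid>" returns that sub-shell's exit status:
#!/bin/bash

( make -C gui ) &      # hypothetical build in its own environment
pid_gui=$!

( make -C test ) &     # second hypothetical build
pid_test=$!

wait $pid_gui  || echo "GUI build failed"
wait $pid_test || echo "TEST build failed"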

Thursday, August 5, 2010

LINUX: Case Insensitive in VIM Editor

Just a quickie here. If you are in the VIM editor and want to search for a term, but do not want the search to be case sensitive, type ":set ic" and press Enter. This tells VIM to treat UPPERCASE and lowercase the same.
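
A few related options, for reference (all standard VIM settings; the parentheses are descriptions, not part of the commands):
:set ic          (shorthand for :set ignorecase)
:set noic        (turn case-sensitive searching back on)
:set smartcase   (with ignorecase on, a search containing capitals becomes case sensitive again)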

Wednesday, August 4, 2010

Essential Effortless Efficiency

There's now a "lens" over on Squidoo that is geared towards collecting different technical time-saving tips from the community.

Head on over and contribute: http://www.squidoo.com/work-smarter-and-faster

LINUX: Creating Macros in VIM Editor

A fantastically efficient way to repeat a series of actions is to use a macro. You may be familiar with macros from common products like MS Word. They are quite easy to create and use.

The simple steps we are going to follow are:
1. Assign macro to a key
2. Start recording
3. Do a bunch of stuff
4. Tell VIM we are done recording
5. Use the macro!
To assign the macro to a key and start recording in one shot, we type "q" followed by whatever key we want to store the macro in. Let's say here we will use "s". So typing "qs" will cause a message "recording" to appear in the bottom left, indicating that we have started recording a macro to store in "s". Don't worry, you won't accidentally run the macro by typing "s", since macros are run by typing the "@" symbol first.

Now that we are recording, it is time to do some stuff. Let us say that we need to add a semicolon to the end of the line, and then move the cursor to the next line to prepare for running the macro again. Here is what you do:
1. Type "i" to go into insert mode
2. Press the End key to go to the end of the line
3. Type ";" since we are in the right place
4. Press the down arrow key to move to the next line
5. Press Escape key to tell VIM we are done inserting text
6. Press "q" one more time to stop recording (the "recording" message goes away)
Now the macro is set. Type "@s" to run the macro. Now, let's say we want to run it on 20 lines of code...use the trick learned in a previous post to repeat the macro 20 times: "20@s". Voila!
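
For reference, here is the full keystroke sequence from this example written out in one place (the angle brackets stand for the named keys, not literal text):
qs                    start recording into register s
i<End>;<Down><Esc>    insert mode, jump to end of the line, type ;, move down a line, leave insert mode
q                     stop recording
20@s                  replay the macro 20 times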

Friday, July 30, 2010

LINUX: Repeat Command/Action N Times in VIM Editor

All too often I find myself redoing a command over and over. I've often said, there must be a better way to do this. And there is! While writing some automation scripts in an agile "paired-programming" method yesterday, I learned a few neat tricks for VI from my programming partner. I'll share them in the next few posts.

The first is repeating a command. Say, for example, you want to delete 100 lines from your file in VIM. Pressing the "d" key twice deletes a single line, so I used to just keep typing dd until everything I wanted was deleted. If you know how many times you want to run a command, you just type the number first, then the command.
100dd
Typing the above in VIM will repeat the single-line delete (dd) command 100 times. How else can it be used? It can even insert text repeatedly! Type the following:
50iTHE END IS NEAR<escape><escape>
The 50i tells VIM that you are going to insert some text that you want repeated 50 times. The text "THE END IS NEAR" is what will be repeated. And pressing the escape key twice implements it. You would get:
THE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEARTHE END IS NEAR
Notice it is all one line. In order to make it appear on multiple lines, just press the Enter key at the end of the text, before the Escape. Very simple.

Thursday, July 29, 2010

LINUX: Alias and Script Subshells

This is an update to my last post on Scripts and Alias. Almost immediately after posting about the solution of enabling the expand_aliases shell option in scripts, I tried it in a more complicated script with a subshell, and it didn't work.

I'll make this post quick. In a script, you can run commands in a subshell (remember, the script itself is already in a subshell). This sub-subshell is not really a subshell, because if you echo the $SHLVL environment variable within it to see what shell level you are at, you will see that it is at the same level as the script. However, we are going to consider it a sub-subshell because whatever it does or sets doesn't influence the rest of the script. You run these sub-subshells by encapsulating them in parentheses. E.g.:
#!/bin/bash
shopt -s expand_aliases
alias testme1="echo It Worked"
alias
testme1
( shopt -s expand_aliases
echo "I am in a subshell!"
echo "Can anyone hear me?"
alias testme2="echo Subshell Alias"
testme2 )
If you run this, you will see that testme1 works, while testme2 doesn't. Even putting the shopt -s expand_aliases within the sub-subshell still doesn't fix it.

So, the solution? I went back to my old dirty hack, and that worked fine.
#!/bin/bash
shopt -s expand_aliases
alias testme1="echo It Worked"
alias
testme1
( shopt -s expand_aliases
echo "I am in a subshell!"
echo "Can anyone hear me?"
alias testme2="echo Subshell Alias"
`alias | grep testme2 | cut -d\' -f2` )

One caveat to point out: if the alias isn't set, this hack will mess things up. I tried running a "make" command using the hack above, but in one case the build environment never set up an alias for "make" and worked fine with just the standard "make". When that environment hit my hack, it found nothing in the alias output, so it did nothing.
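
One hedged way around that caveat is to fall back to the plain command when the alias doesn't exist. Something along these lines (a sketch; whether an alias for "make" exists depends entirely on your build environment setup):
( shopt -s expand_aliases
if alias make > /dev/null 2>&1 ; then
    # an alias exists -- run its expansion, same trick as above
    `alias make | cut -d\' -f2`
else
    # no alias defined -- just run the plain command
    make
fi )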

Tuesday, July 27, 2010

LINUX: Scripts and Alias

This one stumped me for a while; we had build scripts that kept failing, yet if you manually executed the commands from the script, everything worked fine. Eventually I discovered that it wasn't properly executing the "make" command as was set up in the alias. When we set up our build environment, we alias "make" to include a bunch of compiler flags, so when you finish running the setupenv file (to set environment variables and such), all you have to do is run "make" and the alias keeps track of all the necessary build flags.

The scripts never work properly because they interpret your "make" command to be just that--"make" with no build flags. It turns out this is because scripts (non-interactive shells) (1) don't expand aliases and (2) don't bring in the current environment.

Concerning point 2 about the current environment: this is actually a good thing. Imagine if you have a special environment you set up and the script runs fine in your area, then someone else tries to run it in theirs and it fails. It would be a pain to begin debugging by comparing what is in your .bashrc file and what you did to customize your environment. A script should be self-contained, able to run in any environment. So when you run it, it doesn't carry over your environment variables into the non-interactive subshell it runs in.

We are more concerned with point 1 though: a shell script, at least on all the Fedora and RedHat servers I've worked on, does not expand aliases by default. Before I came across the good solution on the internet, I hacked together my own solution of parsing the expanded command out of the alias output:
`alias | grep make | cut -d\' -f2`
That works, but it is not an elegant solution by any means. Recently a coworker was having trouble with a build script and I got into explaining to him that he may have a problem with a non-expanding alias. I showed him my solution and then gave the caveat that there is probably a better, easier solution out there. Then I looked for one and found it (don't ask me why I didn't find it the first time when I created my hack solution). Just place the following command at the beginning of your shell script:
shopt -s expand_aliases
What this does is set the shell option (shopt) to enable (-s) the expansion of aliases (expand_aliases). The man page says that this option is enabled by default for interactive shells, so the implication (and the reality we've witnessed) is that it is disabled by default for non-interactive shells.

If you want to see this in action, try this script:
#!/bin/bash
alias testme="echo It Worked"
alias
testme
The result will print the new alias you assigned (typing alias in line 3 prints out current aliases) for testme. Notice there are no other aliases set, not even the ones that are set globally for the system. The execution of the testme command fails with "command not found" because it is not recognizing the alias.

Now try again with the shopt set:
#!/bin/bash
shopt -s expand_aliases
alias testme="echo It Worked"
alias
testme
Now it will work perfectly, showing the alias we set and echoing the "It Worked" statement.

Wednesday, July 14, 2010

LINUX: The Power of HERE Documents

After doing things the manual, brute-force way, you will be able to truly appreciate the elegance of HERE documents. If you wanted to create a file with a few lines of text in it, you would redirect the echo command to it. E.g.
echo "The old method" > temp.txt
echo "Appending a second line to the file >> temp.txt
Side note--a single redirect arrow, the "greater than" sign, will create a new file called temp.txt (overwriting any existing one) containing the echoed text. The double redirect arrow, as shown on the second line, will not create a new file (nor erase the contents of an existing one); rather, it will append to it.

So that method is okay for doing a few small things, but it isn't efficient or powerful when your needs are greater. We can use a HERE document to put several lines of text into a file without having to continuously echo and redirect. A HERE document can be used for more than just this, but that is the focus of this post. So, the example is as follows:
cat > temp.txt << EOF
The Old Method
Appending a second line to the file
EOF
The temp.txt and EOF can be changed to whatever you want. The temp.txt is just a file to dump the text into. The EOF is a label to indicate the beginning and end of the HERE document. Notice that there is a matching EOF at the end. If you wanted to use the word HERE, that is perfectly fine, just make sure that at the end of the section to put into the file, you write a matching HERE to indicate the end of the block.

The end result of the two methods is the same, however the HERE document scales much better. You can put 100 lines of text right in there with little effort if you desire.

There is even more power to be had here. We can do things like executing commands at run time when compiling the HERE document. E.g.
cat > tmp1 <<EOF
this is the first file
this is the 2nd line of 1st file
`echo hello`
did the above come out as "hello" or echo hello?
no, it executed the echo and just printed hello...nice
EOF
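Conversely, if you want the backticks (or a $VARIABLE) written into the file literally rather than executed, quote the delimiter on the first line--this is standard shell behavior:
cat > tmp2 <<'EOF'
`echo hello` stays exactly as typed
$HOME does not get expanded either
EOF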
Using the technique of appending mentioned above, along with the HERE document, we can also grow our HERE document. E.g.
cat > tmp1 <<EOF
this is the first file
this is the 2nd line of 1st file
`echo hello`
did the above come out as "hello" or echo hello?
no, it executed the echo and just printed hello...nice
EOF

cat >> tmp1 <<EOF
did everything get deleted?
nope, it appended
EOF
The results of the above script would be (inside tmp1 file):
this is the first file
this is the 2nd line of 1st file
hello
did the above come out as "hello" or echo hello?
no, it executed the echo and just printed hello...nice
did everything get deleted?
nope, it appended


Tuesday, June 29, 2010

PERL: Removing Text Within Parentheses

I recently wanted to get a list of major landmarks, but the text list had the name of the landmark, followed by the location of the landmark in parentheses: ex. Eiffel Tower (Paris, France). I just wanted the names of the landmarks without the text in the parentheses, so had to figure out the command to remove all the parenthesized text.

Perl was the winner as tool of choice. There was a small trick to doing this seemingly trivial task, so document it we shall.

Original: Eiffel Tower (Paris, France)
Desired: Eiffel Tower

There are two ways to do it based on what you want:
perl -p -e 's#\(.*\)##g' textfile
You may have seen 's/oldtext/newtext/g' as the syntax before and are wondering why I am using hash marks (or pound signs) instead. You don't have to use the forward slash, it is just the common way; if you want to use a forward slash in the search text without having to escape it, hash marks are the way. They can also make the expression easier to read. Now, onto the command--the \( obviously says look for a left parenthesis, then there is the critical .* which says find any number of any characters. Finally, we close it off with an escaped right parenthesis. This will find anything encapsulated by two parentheses.
perl -p -e 's#\([^)]*\)##g' textfile
This solution will also do the same thing for our Original text string, but it is slightly different. The [^)] means "any character that is not a right parenthesis." The caret (^) negates everything in the brackets. This is useful if you are making an exclusion set. You can place several characters, [^$)?], and it will look for any character except a $, ), or ?.

Since the two commands work the same for the given example, let's show how the commands will vary in different situations:
If textfile contains
1. Paris (France,) Hilton (Hotel)
2. Paris (France (Hilton) Hotel)
Using
perl -p -e 's#\(.*\)##g' textfile
The results would be:
1. Paris
2. Paris
Note the danger here is that, even though in line 1 Hilton is not in parentheses, it gets removed because there is an ending right parenthesis at the end of the line. This may not be the expected/intended operation.

Next, using
perl -p -e 's#\([^)]*\)##g' textfile
The results would be:
1. Paris Hilton
2. Paris Hotel)
The operation for line 1 may have been what we were expecting, but line 2 doesn't look good. The moral here is to understand what you are trying to do and choose the correct command to do the appropriate operation.

Thursday, June 24, 2010

HEALTH: Quasar Workouts

My new favorite method of exercise is Quasar Training. It is a method my friend and I made up recently in the gym to help "spice up" our gym routine. I can't say that the idea is novel--I think most everything has been thought of already--but it is new for us. We called it Quasar because it sounds awesome and motivating, but it really is QSSR training.

Quick Start, Slow Release (QSSR), aka Quasar, is a technique that follows exactly what the name implies. Start the exercise with a Quick Start effort, working the Fast Twitch muscles for speed increases. At the peak of the movement, right before you begin the return movement, you change over from "Quick" to "Slow". The Slow Release is returning in a very slow motion, about 2 "Mississippi Counts" worth.

In one exercise, you are increasing your explosive speed, and gaining strength. The muscle strength you gain is also being exercised differently. Since the return is often a quick release of muscle tension, most workouts rarely exercise the muscles in this fashion. Slow Release will allow quicker gains and a more complete muscular workout as you employ stabilizer muscles as well.

A great way to experience Quasar Training immediately is to do a set of Quasar Push-Ups right now. Lie on the floor in Push-Up position, but with your chest on the floor. Launch yourself up as fast as you can (Quick Start) and then lower yourself back down very slowly (Slow Release). Repeat for a complete set. Do not let your chest touch the ground between repetitions. Each time, you push up as fast as you can to build explosive speed, then return down, fighting gravity very slowly.

This has been talked about on EzineArticles as well: http://ezinearticles.com/?Quasar-Workouts-Are-the-Brightest-Way-to-Stellar-Physical-Fitness&id=4523423

Wednesday, May 26, 2010

HEALTH: Interval Strength Training

If you do strength training, you probably know that your body can plateau and changing up your exercise pattern is a good way to kick your body back into gear. In an effort to do just such a kick start, my gym buddy and I often try new techniques. One such that we find very effective is what we call Interval Strength Training.

The term Interval Training is overused already, utilized for other techniques unrelated to what we are talking about here. While a better name is needed, when we use it in our workout routine, we always know what we mean by it.

You can call it what you want, but in context of this post, Interval Training is a method of simply changing the order of the weight changes in a set. Traditional methods do 3 sets of 10 repetitions (reps) of increasing strength: light, medium, heavy. It is also a good idea to always start with a warm up set that isn't counted in the above 3.

Interval Strength Training is a little different. We shock the muscle by jumping back and forth, and also change the reps per set. The first set is the traditional "light" set at 10 - 12 reps. Next, we jump up to our heaviest set at 4 - 6 reps. If you are able to do more than 6 reps on Heavy, then you have misjudged your weight selection and should fix it next time. The 3rd set is Medium, somewhere between the heavy and light, and should be 8 - 10 reps. Finally, we push out a set of what we call "Super Light", which is set below the Light set from before, and we have to do 20 reps at this weight.

The method here allows for your muscles to work on heavier weights earlier in the workout before they fatigue, which will encourage muscle growth. The super light set at the end helps to build endurance as you push your muscles to fatigue. Our own application of this technique is to do 2 or 3 weeks traditional, then a week of Interval Strength Training to "change it up".

Give it a try and see the results. Feel free to report them here for others to benefit.

To Recap:
1. Light = 10 - 12 Reps
2. Heavy = 4 - 6 Reps
3. Medium = 8 - 10 Reps
4. Super Light = 20 Reps

Thursday, May 6, 2010

CVS: Using Perl to Restrict Commits

This is one of the most useful tricks I have used in terms of SCM controlling of a CVS repository, so listen closely.

CVS is a great tool, and a horrible tool, all in one Open Source package. It can do a lot, but often it can do too much as well. In order to stand a chance of having a well controlled environment, it is often important to have CVS do some of the enforcement work for you. Let it be the bad guy instead of you.

First, check out the CVSROOT module from CVS:
$ cvs co CVSROOT
The most important file here is verifymsg. This file tells CVS which files to pay attention to when someone commits a file. In this file, we will tell CVS to restrict one file to a set of rules, and let another file slide no matter what. Take for example 2 subdirectories, one called GUI and one called TEST. Let's say we don't care about TEST code and can let developers change this stuff at will, but GUI is our production code that we want to keep tight control on. How do we control it?

We can either require them to submit some sort of change document like a bug id, problem report (PR), software change request (SCR), etc., OR we can put certain requirements on the content of the commit comment. The method I employed was to tie them to using SCRs that they had to create with a 3rd party tool called Serena ChangeMan Dimensions. It is a tool for Change Tracking (a critical SCM concept), and we were able to do it with this hacked integration of CVS and Dimensions. A tighter, more seamless integration already exists between Bugzilla and CVS, but we are not always given the flexibility to use the tools we want to use, so we figure out how to use the tools we have to use.

In any case, in order to require them to create an SCR in Dimensions for each check in, we will need to do a few steps:
1)    Tell CVS which files to NOT-enforce on in verifymsg
2)    Write a perl script to check the commit message for an SCR number. This perl script is the key--if we wanted to make sure they wrote a full sentence in their message instead of checking for an SCR number, we could do that here instead too.
3)    Tell CVS which files we DO want to enforce checking on and point it to the perl script created above to verify the commit message.
4)    Commit the verifymsg file to CVS
5)    Ensure the script is executable by everyone.
Now to work on the perl script. A good place for it would be /usr/local/bin, but you can put it anywhere you want. This is going to be a perl script. Don't know what Perl is? It is not something precious made in clams and strung around a pretty girl's neck. It is a powerful scripting language. That is all I will say about it here. If you want to know details, visit http://perldoc.perl.org/index-tutorials.html or http://www.perl.com. My goal here is to just show the shortcuts to get the job done, without delving into the guts of any of this.

So we'll start with an empty file, call it verify-scr.pl, and place the following in the very first line of the file to declare it as a perl file:
#!/usr/bin/perl -w
Next we will place a BEGIN command to tell perl that this block of code should be run as soon as possible (even before the rest gets compiled). In this block we will set $valid=0, and at the end of the file, we will return this variable. This is what our code looks like so far:
#!/usr/bin/perl -w
BEGIN
{
  $valid = 0;
}
The purpose of this block is so that, even if there is a problem interpreting the rest of the perl script, we can still give something back for CVS to understand. Namely, we will immediately set the check to fail, so as a default if the script fails right now, the whole CVS commit will be denied. This is a safety precaution since we don't want developers to commit code that doesn't meet our requirements just because there is a typo in the perl script.

On to the meat of the perl script. Each line is being interpreted individually in this script. What we are analyzing is the text content of the CVS commit comment. We will encapsulate the check within a while loop and want to read the input from STDIN (which is the CVS commit comment that is being sent in). Type while (<>), then we will put the rest in curly braces. The next part is where perl shines--using a regular expression to find the SCR number and capturing it to a variable. In this example, I required users to type "SCR: " followed by the SCR number. The perl script will find that token SCR: and then look at the SCR id right after it.
if (/^SCR:\s*([A-Za-z0-9_]+(?:SCR|scr)[_0-9]+)/)
Goodness that looks complicated, and I won't lie to you, it is! But let's simplify it. Obviously the if statement is just a simple common programming construct, so we'll skip it. The ^ indicates that we are looking for something at the beginning of the line. Without the ^ it will try to match the SCR: anywhere in the line. This is preference, but I believe they should have this SCR number declared at the beginning of a line, not embedded in the middle of a sentence. Much cleaner that way. We already answered the next part...the SCR: after the ^ is just an exact text string we are looking for. The \s* allows any amount of whitespace between the colon and the id. The opening parenthesis that follows is the vital magic here: it starts a capture group, and everything that matches the regular expression inside the parentheses will be stored in the predefined variable $1. Ex. if the CVS commit comment is "SCR: PROJECT_SCR_22", it will parse out only the PROJECT_SCR_22 into $1. Now we tell perl how to find that PROJECT_SCR_22 string. The [A-Za-z0-9_]+ means find any combination of UPPERCASE or lowercase letters, numbers, and underscores--PROJECT_ matches, as well as ProJect_, or even MAcH5_. The + that follows actually doesn't mean "plus the following" as intuition leads you to believe--it means "one or more of the previous".

I will diverge for just a slight moment. What we are using in the above "if" statement is what is called a regular expression and is found in much more than just perl. Often the syntax is very similar, or seemingly exact, but keep in mind we are still doing perl here.

We left off looking for any string that contains at least one UPPERCASE letter, lowercase letter, number, or underscore. In my implementation, I have multiple SCRs for multiple projects, so there is ABC_SCR_1, XYZ_SCR_1, and JoeSmith_123_SCR_1. All are valid SCRs, so our perl regex (short way of saying regular expression) says to look for basically anything, followed by the string "SCR" in either UPPERCASE or lowercase. I didn't allow mixed case in this example. So the next part, right after the first + we saw, is the (?:SCR|scr). This means find the term SCR or scr; the pipe operator "|" means "or" here, and the ?: just marks the inner parentheses as a grouping-only construct so they don't capture anything of their own. Again, only all UPPERCASE or all lowercase is matched, not mixed case. Notice that I did not put a + sign after this group. The reason is I don't want to find SCR "one or more times" nor do I want to find it "zero or more times" (which is what * would mean)--I want to find it exactly one time...no more, no less. At this point, we now have some string indicating the project, then we have the SCR keyword that we have to find, and finally we'll look for some number. That is the [_0-9]+ part; we look for underscores and digits, and again it has a + sign, so we have to at least have SCR followed by some number. It could be 3, 10, 42, or even 2395678542.

Once we close the outer parenthesis, the capture group ends, and the string we just matched inside it is placed into $1. Note, if you had placed multiple capture groups (some_regex), they would incrementally be matched and placed into $1, $2, $3, etc. Place the final close parenthesis to close off the "if".

Now that the hardest part is done, we'll breeze through the rest. Create an open curly brace to say that we are going to put a block of code that will be executed if the if statement above succeeds. In here we will do a bunch of things at once:
{
   my $scr = $1;
   print "We matched $scr from your commit comment\n";
   $valid = 1;
}
First line puts the matched string into a variable called $scr. The print line is just a message we are printing to screen to let the user know what we are doing. Since we used $1 in the print line, it will show them the string we matched. The \n just passes in a new line. We could get fancy at this point and wait for a user input to verify if we matched the right thing...but not in this post. Finally, we set the $valid variable that we used in the BEGIN statement. Remember that we assumed it is immediately false back in the BEGIN, but now we are setting it to TRUE. So if we get to this line and set this variable, when we exit the script, the CVS commit will succeed.

Close up the Curly braces now for the if statement and while statement, then put an "exit !($valid)" and we are all done. The final product looks like:


#!/usr/bin/perl -w
BEGIN
{
  $valid = 0;
}
while (<>)
{
 if (/^SCR:\s*([A-Za-z0-9_]+(?:SCR|scr)[_0-9]+)/)
 {
   my $scr = $1;
   print "We matched $scr from your commit comment\n";
   $valid = 1;
 }
}
exit !($valid)
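Before wiring it into CVS, you can sanity-check the script by piping a sample comment into it by hand (assuming it is saved and made executable as /usr/local/bin/verify-scr.pl):
$ echo "SCR: PROJECT_SCR_22" | /usr/local/bin/verify-scr.pl
We matched PROJECT_SCR_22 from your commit comment
$ echo $?
0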
Our Perl script is done, so now to tell CVS when to use it. Edit the verifymsg file from within the CVSROOT module we checked out. At the end of the file, we are going to add another regex to catch commits and route them to the perl script. Going back to the example started at the beginning of this post, we need to put a line in this verifymsg file to tell CVS that all code in GUI needs to have an SCR, but all code in TEST does not. This is how:
^TEST     /bin/true
^GUI       /usr/local/bin/verify-scr.pl
Remember, we saved our perl script from above in the /usr/local/bin folder, so now we are calling it when the CVS commit matches the ^GUI line. The caret (^) is regex notation for "the beginning of the line", so it is looking for GUI as the first word in the line. If GUI and TEST were part of a larger module, say CODE, then it would be ^CODE/GUI.

Security warning: The file /usr/local/bin/verify-scr.pl needs to be executable to all users (or at least those in the cvs group), but should not be editable to anyone. Make sure to change permissions accordingly.
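
One way to set that up, assuming the script is owned by root (adjust the owner/group if you use a dedicated cvs group):
chown root:root /usr/local/bin/verify-scr.pl
chmod 755 /usr/local/bin/verify-scr.pl    # everyone can read/execute, only root can edit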

Save and commit the verifymsg file. Keep in mind that CVS will match only the first instance in the verifymsg file, so if you have something that matches 2 regex lines in verifymsg, only the first one will be used.

WEB: Google Webmaster Tools, Sitemaps, and Blogger/Blogspot

Today I looked at my Google Webmaster Tools and was once again puzzled as to why the sitemap kept showing an error. My site was not indexing properly and this was really bugging me. I submitted atom.xml as a sitemap and it failed. I submitted rss.xml as a sitemap, and still it failed. The failure is not as bizarre as the fact that I host 2 different blogs on blogspot.com and the old one works fine using atom.xml as a sitemap, but the new one (this one) refuses.

Finally, after some web searching, I stumbled upon an answer. Using "feeds/posts/full" as the address for the sitemap seemed to work.

So, if you are trying to get the sitemap to work in Webmaster Tools, follow these steps:
1)    Log into your Google Webmaster Tools account (http://www.google.com/webmasters/tools).
2)    Click on your site--we are assuming you already created your account and verified the site
3)    Click on Sitemaps under Site Configuration
4)    Click on Submit a Sitemap
5)    At the time of this post, doing the above shows a box that has the base web address filled in and an empty text field to add the location of the sitemap. In this empty text field, add feeds/posts/full and click Submit Sitemap.

That's it! It should show up shortly. Check to make sure there are no crawl errors and you are done.

Tuesday, May 4, 2010

HEALTH: Tips from a Body Builder

Well, this blog cannot survive as just a purely technical blog. Especially not with me as an author. A month ago I bought a Venus Fly Trap; today I was considering buying a Mango Tree to grow in a Hardiness Zone 5 region and was thinking about learning Bonsai techniques so I can prune a large Nam Doc Mai Mango into a size able to be grown in a container. My point is, my brain does a lot more than think about technical things and my wife is loving and patient enough to let me entertain some of my weird passions/hobbies. Heck, last week I freelanced as a second photographer for a friend of mine who is a professional wedding photographer!

So, while cleaning up some files on my computer, I found a document I jotted down some notes in from 2 years ago. My cousin's husband was visiting America for a Body Building competition and stayed with us for a few days. He gave me some great tips on how to get in shape and get definition in my body. I do go to the gym a few times a week and martial arts occasionally, so this advice might have stood a chance of being followed, but my laziness won over and I have never actually followed his "plan to bodily perfection". However, I will share it here so this jewel of information is not lost and perhaps someone will benefit from it.

Morning
--) 2 eggs (whites only)
--) Oatmeal or potatoes for carbs
All meals
--) Avoid fried or oily foods but anything is good
Sweets
--) Sweets are okay during day
--) No sweets after 6pm
Nighttime
--) Eat protein before sleeping
--) No sweets
--) Avoid carbs (can eat but very little)
Working out
--) 30 min to 1 hour before workout
~~~~~) Whey protein prior to strength training
--) Immediately after workout
~~~~~) 5 tablets of Amino Acids
~~~~~) 5 gram scoop in water of Glutamine
--) 3 times a week, cardio
--) Massage muscles often
--) Sauna every 15 days (jump in cold water immediately after)

Sunday, May 2, 2010

CVS: Getting into the mind of CVS and how it works

Since some of my posts are discussing manipulating CVS backend files, it is important to now explain a little bit about the way CVS works.

CVS only cares about its own copy of the code in $CVSROOT. Each folder under $CVSROOT is called a module. The files in here are kept in the same structure with the same filenames, EXCEPT every file has a ",v" at the end of it. No need to go into why, just understand that these are the backend files of files under CVS control and that each ",v" file contains all the version history of the file in it. Developers should NEVER be here, never look here, and I have probably done wrong just telling you about it because your curiosity may cause you to go look in this folder. Screwing up something here can be disastrous.

So why doesn't CVS care about your files? It doesn't need to until you try to do something using the "cvs" command. First you do a checkout to get code to your local area to work on.
$ cvs co <module>

The "co" is a shortcut for "checkout" but either can be used. This will get the latest copy of the code out of CVS, called the HEAD...or the HEAD of the TRUNK. We can talk more about HEAD, TRUNK, BRANCH later when we talk about proper revision control

Once the code is checked out, you will notice a CVS sub-folder in every directory. This CVS folder contains information about:
1. Where the CVS repository is (can even be on a remote machine -- this may hardcode usernames)
2. What module you have checked out from the repository
3. What tag/branch if any
4. What version of every file is checked out
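
That information lives in a handful of small text files inside each CVS folder. A hypothetical checkout might contain something like this (the server, username, module, and tag names are made up):
CVS/Root          :pserver:jsmith@cvsserver:/cvsroot        <- where the repository is
CVS/Repository    CODE/GUI                                  <- module this directory came from
CVS/Tag           Trelease_1_0                              <- only present when a sticky tag/branch is set
CVS/Entries       /main.c/1.4/Mon Apr 12 10:00:00 2010//    <- per-file version bookkeeping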

If this CVS folder gets deleted, there is no link back to CVS. When you do a commit to CVS, it goes into this folder and looks up the data on where to put the files. It will also do checks to make sure you have the latest code prior to allowing you to commit. There is a nuance to be aware of when using code on remote servers: If $CVSROOT is on a remote machine, then your Root file in CVS folder probably contains a :pserver: line with your username hardcoded. This means that, if you are sharing code with another developer (copy and paste whole directories), they will get your username all over the place. I wrote another blog post on how to quickly correct this problem. If the CVS repository is on the same server as you are checking out code to, this problem isn't manifest.

We'll talk more in later posts about how to manipulate CVS in the backend for restructuring code. This is a dangerous, but occasionally necessary, task.

CVS: Changing username in CVS/Root files

It is possible to share a CVS checkout that has been compiled (to avoid having everyone need to recompile it) by just changing the username in the Root files. This only really applies if your code resides on a remote server and you are using :pserver: or some other protocol that hard-codes your username into the $CVSROOT.
find -path '*CVS*' -name Root | xargs perl -p -i -e "s/pserver\:<user1>/pserver\:<user2>/g"

This changes <user1> to <user2>. For debugging purposes, the test executables have the source file locations hardcoded. So even if the directory is copied and the above change is made, debugging will still point to the original file locations (if it exists). This also happens if using shared objects or dynamically linked libraries rather than static libraries. This is frustrating but does not defeat the usefulness of this concept.

WINDOWS: Permission Denied to Network Drive/Share

A strange case manifested itself where a user could not log onto a Linux Samba share drive. He could log on from another machine, so there was no problem with the share, but while attempting access through his Remote Desktop session on the server, he was given a message about being locked out or permission denied to the shortcut or network resource. An odd solution did the trick; searching for the computer, creating a link on the desktop for it, then opening the link works fine to connect to the share. After that happens, it will work from the original link.

1. Using Search / files and folders / computers, search for the network/computer name.
2. The computer should be found. Clicking on it from here will still not work.
3. Make a shortcut of the found link on the desktop.
4. Click this desktop shortcut and you will be taken to the computer/server you were looking for.
The strange part is that now, after clicking on the shortcut, both the original link and using \\<server name> from the address bar will work.

Saturday, May 1, 2010

PIBS: Changing Speeds/Frequencies

While setting up a factory AMCC 440GX Eval Board, it may be necessary to change the default settings for frequency, CPU speed, etc.

To change the boards to 533MHz for the CPU speed, use the following command in PIBS:
chipclk prom0 cpu 533333333 152380952 76190476 76190476 152380952 30000
To change the boards to 800MHz for the CPU speed, use the following command in PIBS:
chipclk prom0 cpu 800000000 133333333 66666666 66666666 66666666 30000

Type "help chipclk" into PIBS to see information on what all the arguments mean. The above commands switch back and forth between the two configurations.

GHS: Flashing networking settings to kernel on board

After the GHS Integrity Kernel boots for the first time, the IP address and Netmask need to be set again and saved to non-volatile memory. These commands should be run after the Kernel banner is displayed and commands can once again be typed into Hyperterminal. A quick test would be to first type "help" to see if the Kernel help menu is displayed (this is different from the PIBS help menu). There is no prompt, so just type the following into the empty area below the Kernel banner:
nc I 192.168.0.13
nc s
nc N 255.255.255.0
nc s
nc H Board3
nc s

The "nc I" sets the IP for the board (customize it appropriately), the "nc N" sets the Netmask, the "nc H" sets the Hostname of the board that is displayed when the banner comes up, and the "nc s" after each command saves the info to non-volatile memory. You could just type the "nc s" once at the end instead of between each command--depends on your level of paranoia.

PIBS: Timeout or Connection Refused

If you are using FTP, mounting an NFS share, or doing some other network-related activity on the board (other than connecting Multi with RTSERV), timeouts or connection refused messages may occur. The solution is simple: if the date/time on the board is way off, it needs to be set correctly. Use the following:

PIBS $ dateset YYYY MM DD
PIBS $ timeset HH MM SS

Tuesday, April 27, 2010

CVS: Update a Tagged set of files to Head

If you create a minimal set and tag it from a larger set of code in a module, you may want to preserve this set of code, but get later versions. This is useful for delivery when we create a minimal set to determine ONLY the files appropriate for sending to the customer, then using that same list for each subsequent delivery rather than going through the effort again.

First check out the old tag:

cvs co -r <tag_name> <module_name>

Then tell CVS to go inside that folder/module that you checked out and update files to HEAD...but ONLY those files, not the entire repository.

find -type f ! -path '*CVS*' | sort | xargs -n1 cvs update -A
Now if you want to retag just these files, use the instructions for "Tagging Individual Files" in an earlier post.

Sunday, April 25, 2010

CVS: Tagging Individual Files

In order to tag individual files for creating a minimal set for instance, delete all the extra files from the directory, but leave in the CVS directories. Then run the following command replacing GENERIC_TAG_NAME with the tag for CVS.
$ find -type f ! -path '*CVS*' | xargs -n1 cvs tag GENERIC_TAG_NAME
This will find all files, ignoring everything in the CVS folders, and then run the cvs tag command on them. If it fails to tag one file, it will continue on to the next.

CVS: Making a file Binary after it's been Checked In

Here is the command to make a file type binary if it is not listed as binary in the repository:
$ cvs admin -kb <filename>
You can go to the cvswrappers file in the CVSROOT directory on the server and specify all the files that you want to be binary by default.
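
For example, adding a line like the following to cvswrappers marks every .png as binary from then on (the file pattern is just an illustration):
*.png -k 'b'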

CVS: Fix Wrong Commit Message

Occasionally, you or a team member will commit code with the wrong commit message or with a blank message. Fixing this is dangerously easy.
$ cvs admin -m1.3:"my message will replace" FILENAME

Where 1.3 is the version number that had the incorrect commit message. This will replace the message previously given.

To do a multi-line commit message, just hit enter before putting the end quote to the comment and continue writing the comment. Close the quote and put the FILENAME when done and hit enter. Something like this:

$ cvs admin -m1.3:"my message
>will
>replace" FILENAME

The > on each new line is the shell's continuation prompt, signifying that the previous line ended mid-command and that it is all still one command.

Saturday, April 24, 2010

CVS: Fixing a misnamed tag

If importing code and the wrong tag was used, before anyone gets messed up, do the following:

Create a new tag pointing to the old tag:
$ cvs rtag -r <Existing_WRONG_Tag> <New_Tag> <module>

Delete the old tag:
$ cvs rtag -d <Existing_WRONG_Tag> <module>

This is how you would rename any tag, not just one made on import of new code, but the cautionary note is that messing around with existing tags can be very dangerous. If the tag is already used in a release, then you have broken your traceability. If the tag is something developers have checked out already, there is now a breakdown in consistency of everyone's configuration. Use this with caution. Luckily, tags do not do any permanent damage to the code repository, but for traceability, try not to change them at will.

LINUX: Process information--priority and niceness

When doing Linux system administration, it can often become necessary to see what the "niceness" or priority of a certain process is. The concept of a process being "nice" has to do with how much it hogs system resources. We're not going to discuss the details of what priority and nice are here; hopefully if you are reading this you already know. The point, though, is that you can use the below command, especially the forest hierarchy, to debug problems. Just typing "ps" doesn't give you this information natively, so you have to pass it a few options to get it. And when looking at the hierarchy, you can determine if a parent process has low priority or a high nice value. For example, if you look at the details for a GCC build and see that it is running with normal values, but it is taking way too long, looking at the forest hierarchy may reveal that the bash shell it was spawned under is running with low priority and a high nice value, which will affect the overall performance.

To get process information showing priority and niceness:
ps -o pid,ni,pri,comm f -U <username>

-U specifies the user. The "f" shows an ASCII-art forest hierarchy.

LINUX: RPM packages

Installing an RPM is rather simple (if the dependencies aren't a problem). Simply type:
rpm -i <package>.rpm

However, you may want to see what scripts are going to be run prior to running it. Or even after, if you want to know where things were installed, you can get that information by typing:
rpm -q -i -l --scripts -p <packagename>.rpm
rpm -q -i -l --scripts <installed name>

The only difference between the above 2 commands is the first one is run against the rpm and can be used prior to installation, while the second one can be run on an installed package and does not need the original rpm.

LINUX: Setting the Date/Time

These instructions were written up for a Linux Fedora Core 3 system. I think it is handled differently on later versions. Ideally you should just point to an NTP server to make your life easier, but in the event that your company firewalls won't allow that, these instructions will help for a standalone system.

There are more complicated explanations on making sure the timezone is set properly and such, but to simply change the time/date, use the following to set the UTC time. Notice that UTC is different than the current time zone most likely, so calculate it accordingly. It is easy enough to just google "time UTC" and find out what it is at this moment.
$ date -u mmddhhmmyyyy.ss

Obviously mm is month, dd is day, hh is hour in 24-hour time, mm is minutes, yyyy is 4 digit year, and ss is seconds. Again, this is the UTC. Once you set that, typing date will show the current time/date based on your zone. To set the zone, you can link /etc/localtime to the right timezone file in /usr/share/zoneinfo by typing:
$ ln -s /usr/share/zoneinfo/EST /etc/localtime

If you have a /etc/localtime file there already, you will need to move it out of the way.

Friday, April 23, 2010

LINUX: Segfault and Core Dump

If you are encountering a segfault while running applications and want to see the core dump to find out what is wrong, you need to set the core file size to unlimited. Do that by typing the following at a command prompt in a Linux shell:
$ ulimit -c unlimited
Closing your terminal session will reset this ulimit setting.
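
Once the segfault happens again and a core file is written, a common way to inspect it is with gdb (here "myapp" stands in for your binary, and the dump may be named core or core.<pid> depending on the system):
$ gdb ./myapp core
(gdb) bt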

LINUX: If-Else Shell Command (OR operator)

A useful operator is ||, which can be used on the Linux command line. It is similar to saying "if not, then do". An example of usage: say you want to put something in a script that launches a tool, cervisia for example, but you want to make sure another instance isn't launched if it is already running. You could do the following.
ps ux | grep -q cervisi[a] || cervisia . &

Now, there is an interesting thing being done with grep. In order to prevent grep from finding itself in the running processes list, use a regular expression for the last character of the process name you are looking for. Hence, the "a" in cervisia is put in brackets [] to make it a regular expression and prevent the grep command from showing up in the results. Note that the two commands above, ps and grep, can be accomplished with the more efficient pgrep command, which combines the two and is already part of most Linux systems.
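
For comparison, the same one-liner with pgrep (the -x flag matches the process name exactly, so the bracket trick is no longer needed):
pgrep -x cervisia > /dev/null || cervisia . &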

PIBS: Initial setup of PIBS boot loader

We use an IBM (or now AMCC) 440GX Ocotea Eval Board. The environment we have is multiple 440GX boards connected to a host machine via a switch that acts as a private LAN. The eval boards never touch the network. The host machine has to have 2 Ethernet cards (we used a USB adapter to get a second NIC). The first connects to the network (on the domain), while the second connects the host machine to the private LAN switch. Thus, you can have users Remote Desktop into the host machine to work on the boards without ever having to physically be in the lab. If you have a remote reset capability enabled, they can do everything remotely, and putting the boards together like this allows for sharing of resources--meaning cost savings, since you will not need a one-to-one ratio of boards to developers.



There is some setup that needs to be done first to get the boards to work right. We are assuming you have the connections described above and that you have a kernel sitting on the host machine (with TFTP enabled) so that the boards can boot from the host machine.



Setup the configuration for the first ethernet port eth0. Here in this example, the board is assigned static IP address 192.168.0.13. Change this to whichever IP you want for your network.
PIBS $ set ifconfigcmd0=ent0 192.168.0.13 netmask 255.255.255.0 up

Assign the location of the kernel on the host machine so that it can be grabbed through the TFTP server.
PIBS $ set bootfilename=C:\tftpboot_gx\integrityappmono3.bin

Set the address of the host machine where the TFTP server is located and the kernel is saved.
PIBS $ set ipdstaddr0=192.168.0.1

Tell the board to use ethernet to get the kernel to boot from on reset/power on.
PIBS $ set autoboot=eth

Sometimes it takes about 3 - 5 seconds to run the ifconfigcmd to bring up the eth0 device. Set the delay to 5 or more seconds; feel this part out since some boards take longer than others for some reason. It seems that the newer AMCC Ocotea boards are slower to bring up the eth0 than the older IBM Ocotea boards. 7 seconds is recommended.
PIBS $ set autobootdelay=7

Thursday, April 22, 2010

GHS: Mount a Remote NFS share

A little background to understand this post. We use Green Hills Software (GHS) Multi to compile code for GHS Integrity, their Real Time Operating System (RTOS) that can be used in embedded systems. If you want to know more about Green Hills Software, you can go to their site (www.ghs.com), but here I am just going to be sharing some gotchas that I've gone through with GHS.

At one point, the number of files that needed to be FTP'd to the 440GX eval board (which simulates the target hardware when it is unavailable) for use by the application software exceeded the space available on the FFS. Therefore, we had to create an NFS share on the Windows host server and mount it when loading the kernel. The downside: the address of the NFS share is hard-coded into the kernel, so you need one kernel per board. The upside: it doesn't affect the design of the waveform or test software; it is transparent, and it is limited only by the size of the server hard drive.

The first part is to start a project in Multi to create a kernel for the IBM 440GX and include a File System. When it comes up, you will see an ivfserver_module.gpj. It is a good idea to make a copy of this and keep a local version in the directory where the kernel project lives so that you don't directly change the GHS standard files. You will have to do the same for each file directly under that .gpj as well.

Next, modify your customized copy of ffs_mountable.c and change the section at the bottom to look like:

vfs_MountEntry vfs_MountTable[] = {
    {
        "192.168.0.1:/Board1",          /* remote NFS share on the host (one per board) */
        /* "192.168.0.1:/Board2", */
        /* "192.168.0.1:/Board3", */
        /* "192.168.0.1:/Board4", */
        "/",                            /* local mount point */
        MOUNT_NFS,
        0,
        MNTTAB_MAKEMP,
        0
    },
    {NULL, NULL, NULL, NULL, 0, 0}      /* Must end with NULL/0 entry */
};

This hardcodes an NFS share into the kernel, so you need a separate kernel for each board. The example above shows a kernel for Board1, which mounts an NFS share called Board1 from the host (192.168.0.1). You can compile at this point, but do not load the kernel until you have created the specified NFS share on the host.

To create the share, use the Server for NFS built into Windows Server 2003. Install it by going to Add/Remove Programs, choosing Add/Remove Windows Components, and selecting it from there; you will need the installation CD.

Wednesday, April 21, 2010

LINUX: Download / Copy Entire Websites

There is a Linux command, wget, that lets you download web pages. Sometimes using wget in recursive mode will not get you more than one page, usually because the site's robots.txt tells it to stop. To grab a whole site, first edit the /etc/wgetrc file and set robots = off, then use a command like the following:
wget --no-parent --wait=20 --limit-rate=20K -r -p -U Mozilla http://mxr.mozilla.org/mozilla/source/webtools/bonsai/index.html

The -U switch tells the receiving server that a Mozilla browser (not a script) is making the requests, --wait pauses between fetches to behave more like a human user, and --limit-rate keeps the bandwidth usage polite. The --no-parent switch prevents wget from climbing above the starting directory and wandering all over the site.
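
If you would rather not edit /etc/wgetrc, the same robots setting can be passed on the command line with -e, for example:
wget -e robots=off --no-parent --wait=20 --limit-rate=20K -r -p -U Mozilla http://mxr.mozilla.org/mozilla/source/webtools/bonsai/index.html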

NOTE: this should only be used to obtain something you are allowed to obtain.

LINUX: Copy Files With Structure

Copying selected files from one folder to another while preserving their directory structure/hierarchy can be done with the find command.

If you have files in a directory called trunk that need to be copied to another folder called branch that already has the same structure, first make sure trunk and branch are at the same level, then cd into trunk. Next run the command below to copy all files matching *.mp* (just an example pattern; substitute your own) from one tree to the other.

find -type f -iname '*.mp*' -exec cp -p {} ../branch/{} \;

If the structure is NOT the same, you will need to create the appropriate structure FIRST. Do the following from within the trunk directory:

find -type d -exec mkdir -p ../branch/{} \;
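
On systems with GNU cp, the two steps can often be combined: cp --parents recreates the directory portion of each source path under the destination. A sketch, again run from inside trunk:
find . -type f -iname '*.mp*' -exec cp -p --parents {} ../branch/ \;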

LINUX: Checking total size of directories

To find the total disk usage of each directory under the current one (for example, each user's home directory) without listing every subfolder, use du with --max-depth=1; subdirectories are not shown individually, but their sizes are still counted in each total.
$ du -h --max-depth=1

Then to sort this list, redirect it to a file such as mysize and run the following to get an ascending list of offenders in the hundreds of MB:
$ du -h --max-depth=1 > mysize

$ grep -e [0-9][0-9][0-9]M mysize | sort

And of course, a simple grep G mysize will show the gigabyte offenders.
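
If your sort supports human-numeric sorting (GNU coreutils does, via -h), you can skip the intermediate file and the grep games entirely:
$ du -h --max-depth=1 | sort -h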

Changing Line Endings with VI

dos2unix is an easy and popular way to change a file from DOS line endings to Unix line endings. But it isn't foolproof...

When dos2unix doesn't seem to get all the line endings, try the following within vi, Vim, or gVim:
:g/^M/s///g

where ^M is entered by typing Ctrl-V then Ctrl-M (it is a single control character, not a caret followed by M)

or try setting the file format directly (then save with :w):
:set fileformat=unix
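
Another variant that sometimes works when the ^M trick is fiddly is to match the carriage return directly, since Vim accepts \r in a search pattern:
:%s/\r//g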

Note, for those who don't know how to use vi: there are tons of resources online, but here is a quick crash course (maybe I'll do more in another post). Typing vi, vim, or gvim will typically launch a vi-like editor. Special commands like save, quit, and much more are entered by typing a colon (:) first. To begin typing text you have to enter insert mode: press "i" to insert, "a" to append after the cursor, or "o" to open a new line and insert. It seems complicated, and it is, but you get used to it and then it is super powerful. Use :x to save and quit when done.

Changing permissions in Linux

Have you ever needed to change permissions on directories (or files) in Linux, but only for a certain subset? There are a number of tricks to get the job done.

Take a simple example: you want all directories set up so that any future files created under them inherit the directory's group. If you want to leave existing files as they are but set the setgid bit (not to be confused with the sticky bit) on the directories, do the following:
$ find -type d | xargs chmod g+s

The left side of the pipe lists all directories under the current one. The right side sets the setgid bit on every directory in that list. The setgid bit makes all files created in those directories from then on owned by the creating user but given the group of the parent directory.

How about if you wanted to find out which file/folders are NOT owned by a certain group (perhaps to change their ownership or permissions)?
find ! -group <name_of_group>

This will list all files/folders that are NOT owned by the group mentioned. You can then put that into a pipe to change ownership or permissions as shown above.
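
For example, to hand everything that isn't already in a hypothetical group called devteam (the group name is just an example) over to that group:
find ! -group devteam -exec chgrp devteam {} \;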

Tuesday, April 20, 2010

Search & Replace using Perl command

The first command below searches recursively for all files that contain <string2replace> (put any string you want here; this is just the placeholder for the example) and prints a list of just the filenames.
The xargs command then passes that list, one file at a time, to perl. Perl's -p flag runs the script on every line of each file, -i edits the file in place, and -e takes the script from the command line; 's/<string2replace>/<replacement>/g' does the substitution, with /g handling multiple occurrences on the same line. Run the command first without -i and pipe it to less to preview all the changes it would make: without -i the results go to standard out and nothing is actually changed (think of it as preview mode), while with -i the replacement is done in place and is permanent (and shows nothing on screen).

First test it to see if it looks good.
$ grep -lr <string2replace> . | xargs -n1 perl -p -e 's/<string2replace>/<replacement>/g' | less

Do it for real.

$ grep -lr <string2replace> . | xargs -n1 perl -p -i -e 's/<string2replace>/<replacement>/g'
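
If GNU sed is available, a rough equivalent of the permanent version is:
$ grep -lr <string2replace> . | xargs sed -i 's/<string2replace>/<replacement>/g'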

Using special characters in a blog or XML files

Well, I just had to learn this one for the last post. Certain characters are reserved in HTML, and if you place them in your post it will not display properly. Technically speaking this has to do with encoding standards and such, but that is more detail than we need here.

I've come across this problem when automatically generating XML files as well. We have a test harness at work called CPPUNIT that has an option to output results to an XML file. The ASSERT or error messages are whatever the developer writes in the code, and they often won't think about how that affects XML. XML has the same special-character restrictions (just like HTML does for this blog), and if any of those characters end up in the XML file, the file becomes invalid. We had to do a search and replace on these special characters to keep our XML files valid.

Anyway, on to the answer. If you need to display <, >, &, ", or ', use the following table: type the entity (the odd-looking string ending in a semicolon) wherever you want the special character to appear in your blog post or XML file.
<  .......  &lt;
>  .......  &gt;
&  .......  &amp;
"  .......  &quot;
'  .......  &#39;

There are many more entities for HTML, but the ones above are the most common and usually all you need.
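
As a small sketch of the kind of search and replace mentioned above, suppose the raw assert message text sits in a plain file (message.txt is just an example name) before it gets embedded into the XML. A perl one-liner can escape the three characters that most often break things; the ampersand has to be handled first, or the entities you just inserted get double-escaped:
$ perl -p -i -e 's/&/&amp;/g; s/</&lt;/g; s/>/&gt;/g' message.txt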

How do I SCP files between machines?

To transfer a file to another machine:
scp <local_filename> <username>@<destination_server>:<remote_location>

Reverse the above if you want to transfer a file from another machine.
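
In other words, to pull a file from another machine:
scp <username>@<source_server>:<remote_filename> <local_location>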

EXT3 Error

Have you ever had a Linux server lock up while the local console continuously reports the following?
EXT3-fs error (device md0) in start_transaction: Journal has aborted
One possible solution is to type the following:
$ echo 1 > /proc/sys/kernel/sysrq
$ echo b > /proc/sysrq-trigger
Both commands must be run as root: the first enables the magic SysRq interface, and the second immediately reboots the machine (the 'b' is for boot), after which the journal can be replayed or the filesystem checked on the way back up.