Variable Parsing à la Bash

Every command-line processor has a way to embed variables in text. Those familiar with DOS will recognize percent-delimited variables, e.g. "echo I am %USERNAME%". Those more at home on Unix-like systems will be more comfortable with dollar variables, e.g. "echo I am $USERNAME". Either way, it is possible to combine fixed text with some external value. I personally prefer bash style, so I will continue with it.

Parsing "User $USERNAME is working on $OS using $NUMBER_OF_PROCESSORS processors." is really easy for a human. Even at first glance, we know that the result should be "User Joe is working on Windows NT using 1024 processors.". Making a computer recognize that is a bit more involved.

Well, the generic solution is really simple. Just make a function that extracts all variables from the string and asks us what their values should be:

string ParseVariables(string text) {
    var res = ParseVariablesStateMachine(text, delegate(string variable) {
        return Environment.GetEnvironmentVariable(variable);
    });
    return res;
}

As you can see from the code above, we call the ParseVariablesStateMachine method, giving it the whole input text along with a callback delegate. That method takes care of parsing the variables out and asks us, via the delegate, what each value should be. In our example we do a simple environment variable lookup, but this can be modified to return almost anything.

The delegate returns the value to the calling method, which continues processing the string until the next variable comes along. At that point it calls our delegate again; rinse and repeat. At the end it returns the fixed text and all the substituted values combined in one nice string.
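To make the callback idea concrete, here is a minimal sketch in which the values come from a dictionary instead of the environment. It is written as a C# top-level program. ExpandWithCallback is a hypothetical stand-in that uses a regular expression rather than the state machine discussed in this post, and the sample values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for ParseVariablesStateMachine: a regular expression
// finds each $NAME and the callback supplies the replacement text.
string ExpandWithCallback(string text, Func<string, string> variableCallback) {
    return Regex.Replace(text, @"\$([A-Za-z_][A-Za-z0-9_]*)",
        match => variableCallback(match.Groups[1].Value) ?? "");
}

// Made-up sample values; any lookup source would work here.
var values = new Dictionary<string, string> {
    ["USERNAME"] = "Joe",
    ["OS"] = "Windows NT",
    ["NUMBER_OF_PROCESSORS"] = "1024",
};

string result = ExpandWithCallback(
    "User $USERNAME is working on $OS using $NUMBER_OF_PROCESSORS processors.",
    name => values.TryGetValue(name, out var value) ? value : "");

Console.WriteLine(result);
// prints: User Joe is working on Windows NT using 1024 processors.
```

The parsing code never needs to know where the values come from; swapping the dictionary for a database or a configuration file only changes the lambda.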

One way to tackle this problem is with a simple state machine (yes, the name was kinda giving it away). We definitely have a starting state, a state where we have just found the start of a variable, and a state where the variable name is being read. The code may look something like this (heavily abridged):

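enum ParsingState { Default, VariableStart, NormalVariable }

string ParseVariablesStateMachine(string text, Func<string, string> variableCallback) {
    var parsedString = new StringBuilder(); // needs "using System.Text;"
    var state = ParsingState.Default;
    int variableStart = 0;
    for (int i = 0; i <= text.Length; i++) { // one extra pass flushes a trailing variable
        char ch = (i < text.Length) ? text[i] : '\0';
        bool isNameChar = char.IsLetterOrDigit(ch) || (ch == '_');
        switch (state) {
            case ParsingState.Default:
                if (ch == '$') {
                    state = ParsingState.VariableStart;
                } else if (i < text.Length) {
                    parsedString.Append(ch); // fixed text is copied as-is
                }
                break;

            case ParsingState.VariableStart:
                if (isNameChar) {
                    variableStart = i;
                    state = ParsingState.NormalVariable;
                } else {
                    parsedString.Append('$'); // lone dollar sign stays literal
                    state = ParsingState.Default;
                    i--; // re-examine this character as fixed text
                }
                break;

            case ParsingState.NormalVariable:
                if (!isNameChar) {
                    parsedString.Append(variableCallback.Invoke(text.Substring(variableStart, i - variableStart)));
                    state = ParsingState.Default; // search for next variable
                    i--; // re-examine this character as fixed text
                }
                break;
        }
    }
    return parsedString.ToString();
}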

Full example source is available for download.

Modifying the ParseVariablesStateMachine method to parse DOS-style variables is an exercise left to the reader. It is actually simpler than bash-style parsing because the closing percent sign tells us exactly where the variable ends.
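For readers who want a head start on that exercise, here is one possible sketch, written as a C# top-level program. ParseDosVariables is a hypothetical name and is not taken from the full example source. Treating "%%" as a literal percent sign and leaving an unpaired "%" untouched are my assumptions, not a claim about how DOS itself behaves.

```csharp
using System;
using System.Text;

// Hypothetical sketch: expands %NAME% pairs by searching for the two delimiters.
string ParseDosVariables(string text, Func<string, string> variableCallback) {
    var output = new StringBuilder();
    int position = 0;
    while (position < text.Length) {
        int open = text.IndexOf('%', position);
        int close = (open >= 0) ? text.IndexOf('%', open + 1) : -1;
        if (close < 0) { // no complete %...% pair left; copy the rest verbatim
            output.Append(text, position, text.Length - position);
            break;
        }
        output.Append(text, position, open - position); // fixed text before the variable
        string name = text.Substring(open + 1, close - open - 1);
        // Assumption: an empty name ("%%") stands for a literal percent sign.
        output.Append(name.Length == 0 ? "%" : variableCallback(name));
        position = close + 1;
    }
    return output.ToString();
}

Console.WriteLine(ParseDosVariables("echo I am %USERNAME%",
    name => name == "USERNAME" ? "Joe" : ""));
// prints: echo I am Joe
```

Because both ends of the name are marked, there is no need to decide which characters may appear in a variable name; two IndexOf calls do all the work.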

PS: The full example also covers the extended variable style (e.g. ${USERNAME}).
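As a rough illustration of what the extended style involves, here is a sketch written as a C# top-level program. ParseBashVariables and IsNameChar are hypothetical names, and the full example may well handle this differently. The sketch assumes that anything between the braces counts as the name, and that a "$" not followed by a name stays in the output as-is.

```csharp
using System;
using System.Text;

bool IsNameChar(char ch) => char.IsLetterOrDigit(ch) || ch == '_';

// Hypothetical sketch handling both $NAME and ${NAME}.
string ParseBashVariables(string text, Func<string, string> variableCallback) {
    var output = new StringBuilder();
    int i = 0;
    while (i < text.Length) {
        if (text[i] != '$') { output.Append(text[i++]); continue; }
        int start = i + 1;
        if (start < text.Length && text[start] == '{') {
            int close = text.IndexOf('}', start);
            if (close > start + 1) { // non-empty ${NAME}
                output.Append(variableCallback(text.Substring(start + 1, close - start - 1)));
                i = close + 1;
                continue;
            }
        }
        int end = start;
        while (end < text.Length && IsNameChar(text[end])) { end++; }
        if (end == start) { output.Append('$'); i++; continue; } // lone '$' stays literal
        output.Append(variableCallback(text.Substring(start, end - start)));
        i = end;
    }
    return output.ToString();
}

Func<string, string> lookup = name => name == "USERNAME" ? "Joe" : "";
Console.WriteLine(ParseBashVariables("${USERNAME}_backup and $USERNAME.", lookup));
// prints: Joe_backup and Joe.
```

The braces matter when the variable is followed directly by a character that could belong to a name: "$USERNAME_backup" would be read as one variable called USERNAME_backup.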
