The other day I was solving a seemingly simple problem. Along the way I hit several hiccups one could have to consider (I say "could" because of the unique environments this problem might surface in, e.g. ones where performance is a primary concern). While working the solution to completion, I started thinking about other ways I could solve the problem. It got me thinking about Programming Pearls, a great book about solving problems effectively.
Now that I've "solved it" myself, I'm curious what you, the readers of this blog, could come up with and what I, by extension, can learn from it.
The Problem at Hand
When given some text as input, that text could include some "bad" characters. For example, when you type a hyphen in Microsoft Word, Word will change it to a dash, since grammatically speaking a dash is different from a hyphen and most often you, the Word user, mean to use a dash, so Word puts one there for you (thumbs up on the usability aspect of this feature, Microsoft). The problem is, the dash character that Word uses is not in the ASCII character set.
With this program, we don't want to allow a certain subset of characters; instead we replace each of them with a different character. Using the Word example, I want to scrub any dash and replace it with a hyphen.
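To make the "not in ASCII" point concrete, here is a minimal sketch of how you might check a character against the 7-bit ASCII range. The class name `AsciiCheck` and the method `IsAscii` are hypothetical names picked for illustration, not part of the challenge.

```csharp
public static class AsciiCheck
{
    // Hypothetical helper: true when the character falls inside
    // the 7-bit ASCII range (code points 0 through 127).
    public static bool IsAscii(char c)
    {
        return c <= 127;
    }
}
```

With this, `AsciiCheck.IsAscii('-')` is true (the hyphen is code 45), while `AsciiCheck.IsAscii('\u2013')` and `AsciiCheck.IsAscii('\u2014')` are both false (the en dash and em dash are codes 8211 and 8212).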
Here are some other common ones:
| Character | Correct ASCII Value | Bad Values |
| Hyphen | 45 | 8208, 8211, 8212, 8722, 173, 8209, 8259 |
| Single Quote | 39 | 96, 8216, 8217, 8242, 769, 768 |
| Double Quote | 34 | 8220, 8221, 8243, 12291 |
| Space | 32 | 160, 8195, 8194 |
What the code should do is simply take in a string, and output the string with the offending "bad" characters replaced with a specified character (using the table above). Very simple.
I've provided the stub code and a unit test here as an example, based on the table above. Fill in the blank and submit!
    [Test]
    public void Input_with_bad_characters_should_be_replaced_with_appropriate_character()
    {
        // Every "bad" value from the table: 6 single quotes, 4 double quotes, 7 hyphens, 3 spaces.
        var badChars = new[] {(char) 96, (char) 8216, (char) 8217, (char) 8242, (char) 769, (char) 768,
            (char) 8220, (char) 8221, (char) 8243, (char) 12291, (char) 8208, (char) 8211, (char) 8212,
            (char) 8722, (char) 173, (char) 8209, (char) 8259, (char) 160, (char) 8195, (char) 8194};
        var badString = new string(badChars);

        var cleanedString = CleanInput(badString);

        // The expected string ends with three spaces, one per bad space character.
        Assert.That(cleanedString, Is.EqualTo("''''''\"\"\"\"-------   "));
    }

    public string CleanInput(string input)
    {
        // your details here
    }
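To give a sense of what an entry could look like, here is one straightforward baseline: a dictionary lookup built from the table above, applied one character at a time. Treat it as a sketch under my own assumptions rather than the intended answer. The class name `InputCleaner` and the helpers `BuildReplacements` and `AddAll` are hypothetical, and I have not measured how it performs.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class InputCleaner
{
    // Maps each "bad" character code from the table to its ASCII replacement.
    private static readonly Dictionary<char, char> Replacements = BuildReplacements();

    private static Dictionary<char, char> BuildReplacements()
    {
        var map = new Dictionary<char, char>();
        AddAll(map, '-', 8208, 8211, 8212, 8722, 173, 8209, 8259);
        AddAll(map, '\'', 96, 8216, 8217, 8242, 769, 768);
        AddAll(map, '"', 8220, 8221, 8243, 12291);
        AddAll(map, ' ', 160, 8195, 8194);
        return map;
    }

    // Registers every code in badCodes as mapping to the same replacement.
    private static void AddAll(Dictionary<char, char> map, char replacement, params int[] badCodes)
    {
        foreach (int code in badCodes)
        {
            map[(char)code] = replacement;
        }
    }

    public static string CleanInput(string input)
    {
        if (input == null) throw new ArgumentNullException("input");

        var cleaned = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            // Swap the character if it is in the table, otherwise keep it unchanged.
            char replacement;
            cleaned.Append(Replacements.TryGetValue(c, out replacement) ? replacement : c);
        }
        return cleaned.ToString();
    }
}
```

For example, `InputCleaner.CleanInput("\u201Chello\u201D \u2013 world")` returns `"hello" - world` with plain ASCII quotes and a hyphen. A lookup per character keeps the code short. Whether it is the fastest or the most creative option is exactly what the challenge leaves open.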
The Judging
Judging will be entirely at my discretion. I'm looking first for correctness: that the problem is in fact solved. After that I'm open to seeing what you can do. The only limit is your creativity.
The Prize
I am offering up your choice of a book from Addison-Wesley or a $25 gift certificate to ThinkGeek.
I am thoroughly excited to see what you can come up with.
(Please post submissions as a comment to http://gist.github.com. Leave some way for me to get in contact with you.)
Updates
Update #1: Please make all submissions from here on at http://gist.github.com (thanks, Tuna).
Update #2: I will stop reviewing submissions one week after the post date.
Update #3: Feel free to leave your comments on who you think should win and why (I still reserve the right to overrule you :-))
Posted 08-09-2009 10:14 PM by Tim Barcz