Strip the spaces from a string

May 5, 2009 at 3:23 am 2 comments

| Gabriel |

Because both Stata and the OS treat unquoted whitespace as a delimiter when parsing commands, I try to keep spaces out of strings when possible and either run the words together or put in underscores. I especially like to do this for anything having to do with file paths. On the other hand, I sometimes want to keep the spaces. For instance, if I have a file with lots of pop songs (many of which have spaces in their titles) and I want to graph them, I like to have the regular spelling in the title (as displayed in the graph) but take the spaces out for the filename. I wrote a little program called “nospaces” that lower-cases a string, strips out the spaces (or replaces them with an optional second argument), and returns the result as a global.

capture program drop nospaces
program define nospaces
	// first argument: string to clean; second (optional): replacement for each space
	set more off
	local x=lower("`1'")
	local char "`2'"
	local nspaces=wordcount("`x'")-1
	forvalues ns=1/`nspaces' {
		// the greedy match means each pass replaces the last remaining space
		quietly disp regexm("`x'","(.+) (.+)")
		local x=regexs(1)+"`char'"+regexs(2)
	}
	global x = "`x'"
end

Note that I’m too clumsy a programmer to figure out how to get it to return a local, so there’s the somewhat awkward workaround of having it return a global called “x.” There’s no reason to use this program interactively, but it could be a good routine to work into a do-file. Here’s an example of how it would work. This little snippet (which would only be useful if you looped it) opens a dataset, keeps a subset, and saves that subset as a file named after the keep criteria.

local thisartist "Dance Hall Crashers"
local thissong "Fight All Night"
use alldata, clear
keep if artist=="`thisartist'" & song=="`thissong'"
nospaces "`thisartist'"
local thisartist "$x"
nospaces "`thissong'"
save "`thisartist'_${x}.dta", replace
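
For what it's worth, here is a rough sketch of what the looped version might look like. It is only an illustration under several assumptions: the three rows of data are made up so the snippet runs on its own (in practice you would just use alldata), it saves one file per artist rather than per artist-song pair, and it uses Stata's built-in subinstr() function in place of nospaces so that it does not depend on the program above. It would also break on artist names that contain double quotes.

```stata
// Made-up rows standing in for alldata (assumption, for illustration only)
clear
input str30 artist str30 song
"Dance Hall Crashers" "Fight All Night"
"Dance Hall Crashers" "Lost Again"
"The Specials" "Ghost Town"
end
tempfile alldata
save `alldata'

// Collect the distinct artist names, then save one subset per artist
levelsof artist, local(artists)
foreach thisartist of local artists {
	use `alldata', clear
	keep if artist == "`thisartist'"
	// lower-case the name and swap spaces for underscores to build the filename
	local fname = lower(subinstr("`thisartist'", " ", "_", .))
	save "`fname'.dta", replace
}
```

With these rows the loop should write dance_hall_crashers.dta and the_specials.dta to the working directory.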



  • 1. Mike3550  |  May 5, 2009 at 7:04 pm

    Gabriel – I think that you can return a local by defining the program as an rclass program:

    program define nospaces, rclass

    Then, add a line in the program

    return local x "`x'"

    After the program runs, you can type

    return list

    and you will see the string in the macro r(x), and you can refer to it in a following command by `r(x)'. It only works if the string is <=255 characters.
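
    Put together, a minimal sketch of that suggestion might look like the following. The program name nospaces_r and the local artistfile are made up here so the sketch does not overwrite the original; the body is the loop from the post with the global swapped for a returned local.

    ```stata
    capture program drop nospaces_r
    program define nospaces_r, rclass
        // first argument: string to clean; second (optional): replacement for each space
        local x = lower("`1'")
        local char "`2'"
        local nspaces = wordcount("`x'") - 1
        forvalues ns = 1/`nspaces' {
            // greedy match: each pass replaces the last remaining space
            if regexm("`x'", "(.+) (.+)") {
                local x = regexs(1) + "`char'" + regexs(2)
            }
        }
        return local x "`x'"
    end

    nospaces_r "Dance Hall Crashers" "_"
    return list
    // copy r(x) right away, since the next r-class command will overwrite it
    local artistfile "`r(x)'"
    display "`artistfile'"
    ```

    The display line should print dance_hall_crashers.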

    Also, could you use Stata’s built-in “subinstr” string function to do this?

    local x = subinstr("`thisartist'"," ","",.)

    Again, this will only work for strings <= 255 characters, though. For strings that are longer, you could use a local extended function:

    local x : subinstr local thisartist " " "", all

    or, if there is the possibility that apostrophes or quotes are in the string:

    local x: subinstr local thisartist `" "' `""', all

    It seems like that might be easier, although I could be missing something obvious.
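
    As a quick side-by-side, here is a small sketch of the three forms on the artist name from the post. The local names a through d are made up for illustration, and the last line is an extra variation (underscores plus lower-casing) to mimic what nospaces does with a second argument.

    ```stata
    local thisartist "Dance Hall Crashers"

    // Function form: evaluated as a string expression
    local a = subinstr("`thisartist'", " ", "", .)

    // Extended macro function form: works on the macro contents directly
    local b : subinstr local thisartist " " "", all

    // Same, with compound double quotes around the arguments
    local c : subinstr local thisartist `" "' `""', all

    // Variation: underscores instead of deletion, then lower-cased
    local d = lower(subinstr("`thisartist'", " ", "_", .))

    display "`a'"
    display "`d'"
    ```

    The first three locals should all hold DanceHallCrashers, and the last should hold dance_hall_crashers.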

  • 2. gabrielrossman  |  May 6, 2009 at 3:04 pm

    this is great advice (and not for the first time). first of all it’s a bit embarrassing to have constructed a total Rube Goldberg device when “subinstr” will get the job done. second, your code on how to get it to return a local sounds very useful. when i have a chance i’ll test it out and update the post.
