String Data to Numeric Data

By Diprete Iyabi @ 5 Apr. 2014 18:17

Hi! How does one convert string data in the format below into a numeric format that Stata can work with? Thank you.

2011(1)
2011(2)
2011(3)
2011(4)
2011(5)
2011(6)
2011(7)

By Søren Skotte Bjerregaard @ 6 Apr. 2014 00:12

Say x is your string variable.

Then usually I use the following command:

encode x, gen(y)

y will then be generated as a numeric variable holding integer codes (1, 2, 3, ...) for the distinct values of x, with the original strings attached as value labels.

You can also find it explained in more detail here:

http://www.stata.com/support/faqs/data-management/encoding-string-variable/
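
To make the behaviour of encode concrete, here is a minimal sketch with made-up data (only the names x and y come from the post above). It shows that the codes follow the alphabetical order of the strings, not the numbers inside them:

```stata
* Made-up example data; x and y follow the names used above.
clear
input str8 x
"2011(1)"
"2011(2)"
"2011(10)"
end

encode x, gen(y)

* y holds integer codes assigned in alphabetical order of the strings,
* so "2011(10)" is coded before "2011(2)".
list x y, nolabel
```

So encode is handy for turning categories into labelled numbers, but the codes are not themselves years or months.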

Best regards, Søren

By Anders Munk-Nielsen @ 6 Apr. 2014 09:56

Alternatively, if you want to create a "year" and a "month" variable based on the "yearMonth" variable that you describe, try the following code. 

  • g year = regexs(1) if regexm(yearMonth,"^([0-9][0-9][0-9][0-9])\(([0-9]+)\)$")
  • g month = regexs(2) if regexm(yearMonth,"^([0-9][0-9][0-9][0-9])\(([0-9]+)\)$")
  • destring year , replace 
  • destring month , replace
Then, if you prefer a date in a more standard format, you could create a variable holding the first day of each month by writing
  • g date = mdy(month,1,year)
  • format date %td
Hope that helps. 
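
As a self-contained variant of the same idea, the sketch below uses made-up data and assumes that the number in parentheses really is a month (1 to 12). It converts the matched pieces with real() instead of destring and builds a monthly date with ym(); the name mdate is only an illustrative choice:

```stata
* Made-up example data; yearMonth is the variable name assumed above.
clear
input str8 yearMonth
"2011(1)"
"2011(2)"
"2011(12)"
end

* Extract the two numeric parts and convert them with real().
gen year  = real(regexs(1)) if regexm(yearMonth, "^([0-9]+)\(([0-9]+)\)$")
gen month = real(regexs(2)) if regexm(yearMonth, "^([0-9]+)\(([0-9]+)\)$")

* Monthly date (months since 1960m1), displayed in %tm format.
gen mdate = ym(year, month)
format mdate %tm
list
```

If the number in parentheses can exceed 12, ym() returns missing for those rows, which would be a sign that it is not a month after all.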

By Diprete Iyabi @ 6 Apr. 2014 21:52

Thank you both so very much!

I have another question: is there a way to create dummy variables for a specific year, and to have a dummy take two variables into account?

By Søren Skotte Bjerregaard @ 6 Apr. 2014 22:54

If you need a dummy for 2007 and your variable is called year, this should work:

gen d07 = year == 2007

If you need a lot of dummies, there are probably more elegant ways to create them automatically.


By Anders Munk-Nielsen @ 6 Apr. 2014 22:59

Totally agree with Søren's reply, that's by far the easiest way to do it. 

If you need to do it for a lot of values, maybe try

  • tab year , gen(dYear)
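then Stata will generate one dummy per distinct year, named "dYear1", "dYear2", ..., in ascending order of year; the variable labels record which year each dummy stands for.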
Was that your question? 
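
If you would rather have the year in the variable name itself, a loop over the distinct values is one option. This is only a sketch with made-up data, and the d<year> naming is an illustrative choice:

```stata
* Made-up example data.
clear
input year
2011
2011
2012
2013
end

* Collect the distinct years, then create one dummy per year.
levelsof year, local(years)
foreach y of local years {
    gen byte d`y' = (year == `y') if !missing(year)
}
list
```

The if !missing(year) part keeps the dummies missing, rather than 0, wherever year itself is missing.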

By Diprete Iyabi @ 6 Apr. 2014 23:11



Yes, pretty much, thank you so much! But in relation to what I asked about dummy variables (see the table below): how do I make a dummy variable that is 1 if a person can speak English and French, and 0 if not, based on the years, and vice versa?

Thank you


year  languages1  languages2
1990  english     french
1990  english     german
1991  french      russian
1991  french      english


By Søren Skotte Bjerregaard @ 7 Apr. 2014 00:29

Hmm, that is more difficult. You could try creating a dummy for each year and interacting it with a 'language' dummy. If you don't have too many languages to deal with, one approach for English and French could be the following:

gen EF = 0

replace EF = 1 if languages1=="english" & languages2=="french"

(and add this if the order of the languages doesn't matter)

replace EF = 1 if languages1=="french" & languages2=="english"

However, this approach probably becomes quite tedious if you have many more languages than shown in your table. In that case copy/paste may also have its academic merits, unless someone else can come up with a better suggestion!
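
One way to cut down on the copy/paste, sketched here on the example table from above: first build one "speaks this language" dummy per language, whichever column the language appears in, and then combine those. The sp_ prefix and the list of languages in the loop are illustrative assumptions, not something from the original data:

```stata
* The example table from above, entered by hand.
clear
input year str10 languages1 str10 languages2
1990 "english" "french"
1990 "english" "german"
1991 "french"  "russian"
1991 "french"  "english"
end

* One dummy per language, regardless of which column holds it.
foreach lang in english french german russian {
    gen byte sp_`lang' = (languages1 == "`lang'") | (languages2 == "`lang'")
}

* A pair dummy is then just the combination of two language dummies.
gen byte EF = sp_english & sp_french
list year languages1 languages2 EF
```

To restrict a dummy to particular years, a condition such as & inrange(year, 1990, 1995) could be added to the last gen line; that year range is purely hypothetical.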

By Diprete Iyabi @ 7 Apr. 2014 00:44

OK, great, thanks. Yes, I have a lot more languages, and I also have to account for the years: in some years the school didn't offer French, then a few years later it did, and then another language was added, and so on. It's really complicated.



By Patrick Kofod Mogensen @ 7 Apr. 2014 17:16

Diprete Iyabi said:

OK, great, thanks. Yes, I have a lot more languages, and I also have to account for the years: in some years the school didn't offer French, then a few years later it did, and then another language was added, and so on. It's really complicated.



Could you perhaps give an example of what the dummy would be in the table above (by adding another column)? I'm not sure I understand how the year is relevant to the language dummy.

By Diprete Iyabi @ 7 Apr. 2014 17:53

Hi, I figured it out, actually: I had to create a do-file, and it worked.
