Stata is a statistical software package commonly used in the social sciences and economics, and it is widely used at IPA for data analysis and management. It offers a comprehensive library of methods for data cleaning, descriptive statistics, and econometric analysis. Stata is well suited to research data workflows and research design tasks, including power calculations, sample design adjustments, panel data analysis, and time series analysis. See Stata Features for the full list of capabilities.
How to install Stata?
IPA staff can download the installer for their operating system (.exe for Windows, .dmg for macOS, or .tar.gz for Linux) from the IPA installation packages on Box.
Coding Conventions
See the following resources for coding conventions in Stata:
Within a Python script or Jupyter Notebook, you can call Stata using pystata.
import stata_setup
# set configuration to the path where Stata is installed and the flavor of Stata
# in the case below, we're using Stata 18 SE
stata_setup.config("C:/Program Files/Stata18/", "se")
  ___  ____  ____  ____  ____ ®
 /__    /   ____/   /   ____/      StataNow 18.5
___/   /   /___/   /   /___/       SE—Standard Edition

 Statistics and Data Science       Copyright 1985-2023 StataCorp LLC
                                   StataCorp
                                   4905 Lakeway Drive
                                   College Station, Texas 77845 USA
                                   800-782-8272        https://www.stata.com
                                   979-696-4600        service@stata.com

Stata license: Unlimited-user network, expiring 22 Jan 2025
Serial number: 401809300803
  Licensed to: Niall Keleher
               Innovations for Poverty Action

Notes:
      1. Unicode is supported; see help unicode_advice.
      2. Maximum number of variables is set to 5,000 but can be increased;
          see help set_maxvar.
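The hard-coded path above only works on machines where Stata lives in that exact folder. As a minimal sketch, the configuration step could first search a list of likely install directories and only call stata_setup.config when one exists. The helper names find_stata_dir and configure_stata are hypothetical, and the macOS and Linux paths are assumed defaults rather than locations confirmed for IPA machines; adjust them to match your installation.

```python
from pathlib import Path

# Hypothetical install locations; edit these to match your machine.
CANDIDATE_DIRS = [
    "C:/Program Files/Stata18/",  # Windows path used in the example above
    "/Applications/Stata/",       # macOS (assumed default location)
    "/usr/local/stata18/",        # Linux (assumed default location)
]


def find_stata_dir(candidates):
    """Return the first candidate that is an existing directory, else None."""
    for candidate in candidates:
        path = Path(candidate)
        if path.is_dir():
            return path
    return None


def configure_stata(candidates=CANDIDATE_DIRS, edition="se"):
    """Configure pystata from the first install directory found.

    Returns True if stata_setup.config was called, False otherwise.
    """
    stata_dir = find_stata_dir(candidates)
    if stata_dir is None:
        print("No Stata installation found in the candidate directories.")
        return False
    # Imported lazily so the directory search works even without stata_setup.
    import stata_setup
    stata_setup.config(str(stata_dir), edition)
    return True


if __name__ == "__main__":
    configure_stata()
```

The edition argument should match your license: "be", "se", or "mp".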
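Alternatively, use the IPython magic commands that pystata provides to run Stata code in a Jupyter Notebook. The %%stata cell magic runs a whole cell of Stata code, and the %stata line magic runs a single command.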
%%stata
sysuse auto, clear
describe
. sysuse auto, clear
(1978 automobile data)
. describe
Contains data from C:\Program Files\Stata18/ado\base/a/auto.dta
 Observations:            74                  1978 automobile data
    Variables:            12                  13 Apr 2022 17:45
                                              (_dta has notes)
-------------------------------------------------------------------------------
Variable      Storage   Display    Value
    name         type    format    label      Variable label
-------------------------------------------------------------------------------
make            str18   %-18s                 Make and model
price           int     %8.0gc                Price
mpg             int     %8.0g                 Mileage (mpg)
rep78           int     %8.0g                 Repair record 1978
headroom        float   %6.1f                 Headroom (in.)
trunk           int     %8.0g                 Trunk space (cu. ft.)
weight          int     %8.0gc                Weight (lbs.)
length          int     %8.0g                 Length (in.)
turn            int     %8.0g                 Turn circle (ft.)
displacement    int     %8.0g                 Displacement (cu. in.)
gear_ratio      float   %6.2f                 Gear ratio
foreign         byte    %8.0g      origin     Car origin
-------------------------------------------------------------------------------
Sorted by: foreign
.
%stata scatter mpg price
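The magics above are only available inside IPython or Jupyter. In a plain Python script, the same commands can be passed as a string to pystata's stata.run function, and the dataset in memory can be copied into pandas with stata.pdataframe_from_data. The sketch below assumes stata_setup.config has already been called in the session and that pandas is installed; build_commands and run_in_stata are hypothetical helper names used for illustration.

```python
def build_commands(dataset, variables):
    """Return Stata commands that load a shipped dataset and summarize variables."""
    varlist = " ".join(variables)
    return "\n".join([
        f"sysuse {dataset}, clear",
        f"summarize {varlist}",
    ])


def run_in_stata(commands):
    """Run commands through pystata and return the data as a pandas DataFrame.

    Assumes stata_setup.config(...) was called earlier in this session.
    Returns None if pystata cannot be imported.
    """
    try:
        from pystata import stata
    except ImportError:
        print("pystata is not available; call stata_setup.config(...) first.")
        return None
    stata.run(commands)                  # execute the commands in Stata
    return stata.pdataframe_from_data()  # copy the dataset in memory into pandas


commands = build_commands("auto", ["mpg", "price"])
print(commands)

if __name__ == "__main__":
    df = run_in_stata(commands)
```

Building the command string separately from running it keeps the Stata code easy to inspect or log before it is executed.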
Data Visualization
Consider installing ipaplots, the IPA graph scheme for Stata.
Learning References
For more information on learning and using Stata, see the IPA-Stata-Trainings repository on GitHub.