Tuesday, December 27, 2016

Types of Variables

Please refer to the post on "Scale of Measurement" along with this one to get a better understanding of the variables used in statistical modeling.


Quantitative and Categorical:


Variables are also classified according to their characteristics. They can be quantitative or categorical. In order to plan a statistical analysis or interpret your results, you need to know which types of variables you have. Data that consists of counts or measurements is called quantitative data. You also hear this type of data referred to as numerical data. If you can perform arithmetic operations, like addition and subtraction, or take a sample average of your data, then you know that it is quantitative. Suppose you take a survey of the buying habits of families. An example of quantitative data in your survey is the age in years of the respondents. Age is a quantitative variable because it would make sense to compute the average age of individuals in a sample.

Quantitative data can be further divided into two types: discrete and continuous. Discrete data consists of variables that can take only a countable number of distinct values within a measurement range, typically whole numbers such as 0, 1, 2, 3, and so on. An example of discrete data is the number of children in a family. A family can have two or three children, but not 2.65 children. Continuous data consists of variables measured on a scale with no breaks or jumps, so that any value within a range is possible. An example of a continuous variable is gas mileage. The gas mileage for a particular car might be 19 miles per gallon or 19.1 miles per gallon or 19.191034 miles per gallon, and so on. Remember that practical limitations affect the precision of the measurement, so recorded values of a continuous variable are always rounded to some degree.
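The distinction between discrete and continuous data can be sketched in a few lines of Python. The values and variable names below are made up for illustration; they are not from a real survey.

```python
from statistics import mean

# Hypothetical values, invented for illustration only.
children_per_family = [2, 3, 0, 1, 2]        # discrete: whole-number counts
miles_per_gallon = [19.0, 19.1, 19.191034]   # continuous: any value in a range

# Every discrete observation here is a whole number.
all_whole = all(isinstance(n, int) for n in children_per_family)

# Arithmetic is meaningful for both kinds of quantitative data.
# The mean of a discrete variable need not itself be a possible value.
mean_children = mean(children_per_family)
mean_mpg = mean(miles_per_gallon)

# Recording mileage to one decimal place limits the precision of the
# measurement; it does not turn the variable into a discrete one.
recorded_mpg = [round(x, 1) for x in miles_per_gallon]

print(all_whole, mean_children, mean_mpg, recorded_mpg)
```

Note that the average number of children comes out as a fraction even though no single family can have a fractional number of children. The average is still a sensible summary, which is what makes the variable quantitative.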

Categorical data consists of variables that denote groupings or labels. This type of data is also called attribute data. Categorical data can be distinguished from quantitative data because it does not make sense to perform arithmetic operations on categorical variables. For example, suppose your survey includes a variable for the political party affiliation of survey respondents (Democrat, Republican, Independent, other). It doesn't make sense to try to add or average the responses Republican and Democrat.
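Although categorical responses cannot be added or averaged, they can be counted. A minimal sketch, using invented responses and hypothetical variable names, of the summaries that do make sense for such a variable:

```python
from collections import Counter
from statistics import mode

# Hypothetical survey responses, invented for illustration only.
party = ["Democrat", "Republican", "Independent", "Democrat", "other"]

# Frequencies: how many respondents fall into each category.
counts = Counter(party)

# The mode (most frequent category) is a valid summary for labels.
most_common_party = mode(party)

# Proportions are computed from the counts, not from the labels themselves.
proportions = {label: n / len(party) for label, n in counts.items()}

print(counts, most_common_party, proportions)
```

The arithmetic here is applied to the counts of each label, never to the labels.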

There are two main types of categorical variables: nominal and ordinal. A nominal categorical variable exhibits no ordering within its observed levels, groups, or categories. Gender is an example of a nominal variable. There is no ordering to the groups male and female. The type of beverage you can order from a menu, such as soda, coffee, or juice, has no logical ordering to it, so it is also a nominal variable. Nominal categorical variables can be coded to appear numeric, but the numbers carry no quantitative meaning. For example, the variable Gender can be coded 1 for male and 2 for female. These numbers are not inherently meaningful: they could be reversed, or replaced by any other set of distinct numbers. A variable that lies on a nominal scale is sometimes called a qualitative or classification variable.
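To see why the numeric codes of a nominal variable are arbitrary, consider this small sketch. The data and both coding schemes are invented for illustration. Reversing the codes changes the "average" but leaves the category counts untouched, which shows that the average was never meaningful.

```python
from collections import Counter
from statistics import mean

# Hypothetical responses, invented for illustration only.
gender = ["male", "female", "female", "male", "female"]

coding_a = {"male": 1, "female": 2}
coding_b = {"male": 2, "female": 1}  # the same labels with the codes reversed

coded_a = [coding_a[g] for g in gender]
coded_b = [coding_b[g] for g in gender]

# The "mean" depends entirely on the arbitrary choice of codes.
mean_a = mean(coded_a)
mean_b = mean(coded_b)

# Decoding back to labels gives identical counts under either scheme.
decode_a = {code: label for label, code in coding_a.items()}
decode_b = {code: label for label, code in coding_b.items()}
counts_a = Counter(decode_a[c] for c in coded_a)
counts_b = Counter(decode_b[c] for c in coded_b)

print(mean_a, mean_b, counts_a == counts_b)
```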

With ordinal categorical variables, the observed levels can be ranked in a meaningful way: the ordering reflects differences in magnitude between the groups or categories, although the size of those differences is not defined. Disease condition divided into categories of low, moderate, or severe is an example of an ordinal variable. The size of beverage you can order from a menu (small, medium, or large) does have a logical order to it, so it is also an ordinal variable.
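One way to represent an ordinal variable in code is to store the levels in their logical order and derive a rank from the position. The sketch below uses invented patient data and hypothetical names. The ranks are used only for sorting, comparing, and finding a middle level, not for arithmetic such as averaging.

```python
from statistics import median_low

# Levels listed in their logical order; the position defines the rank.
severity_order = ["low", "moderate", "severe"]
rank = {level: i for i, level in enumerate(severity_order)}

# Hypothetical observations, invented for illustration only.
patients = ["severe", "low", "moderate", "low", "severe"]

# Sorting by rank is meaningful because the levels are ordered.
sorted_patients = sorted(patients, key=rank.__getitem__)

# Comparisons between levels are meaningful too.
severe_is_worse = rank["severe"] > rank["moderate"]

# median_low always returns an observed rank, so it maps back to a real level.
median_level = severity_order[median_low(rank[p] for p in patients)]

print(sorted_patients, severe_is_worse, median_level)
```

Sorting the levels as plain strings would order them alphabetically, which is why the explicit order list is needed.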
