Course, academic year 2023/2024
SAS Application in Demography III - MD360P46
Title: Demografické aplikace SAS III
Czech title: Demografické aplikace SAS III
Guaranteed by: Department of Demography and Geodemography (31-360)
Faculty: Faculty of Science
Actual: from 2019
Semester: winter
E-Credits: 4
Examination process: winter s.:combined
Hours per week, examination: winter s.:1/1, Ex [HT]
Capacity: unlimited
Min. number of students: unlimited
4EU+: no
Virtual mobility / capacity: no
State of the course: cancelled
Language: Czech
Level: specialized
Is provided by: MD360P46R
Note: enabled for web enrollment
Guarantor: prof. RNDr. Jitka Rychtaříková, CSc.
Teacher(s): prof. RNDr. Jitka Rychtaříková, CSc.
Incompatibility : MD360P46R
Pre-requisite : MD360P38
Is incompatible with: MD360P46R
Annotation -
Last update: prof. RNDr. Jitka Rychtaříková, CSc. (11.04.2012)
Applied use of Base SAS Statistical Procedures and selected procedures of SAS/STAT (BOXPLOT, ANOVA, FACTOR, STDIZE, CLUSTER, DISTANCE, TREE, VARCLUS).
Literature -
Last update: prof. RNDr. Jitka Rychtaříková, CSc. (11.04.2012)

SAS Institute Inc. 2011. Base SAS® 9.3 Procedures Guide. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 2011. Base SAS® 9.3 Procedures Guide: Statistical Procedures. Cary, NC: SAS Institute Inc.

SAS Institute Inc. 2008. SAS/STAT® 9.3 User’s Guide. Cary, NC: SAS Institute Inc.

Jan Hendl: Přehled statistických metod zpracování dat, Portál s.r.o., Praha 2004, 583 s., ISBN 80-7178-820-1

Requirements to the exam -
Last update: prof. RNDr. Jitka Rychtaříková, CSc. (11.04.2012)

The examination is oral. Prerequisites are passing the final written test (writing a program) and active participation in class.

Syllabus -
Last update: prof. RNDr. Jitka Rychtaříková, CSc. (09.03.2012)

1. The UNIVARIATE Procedure (Base SAS Statistical Procedures). Descriptive (summary) statistics based on moments (mean, variance, standard deviation, coefficient of variation, skewness, kurtosis), quantiles, mode(s), extreme values, frequencies. Confidence intervals for the mean, standard deviation, and variance. FREQ and WEIGHT statements. Histograms (HISTOGRAM statement) and their options (parametric distributions, nonparametric kernel density estimation, graphic options). Placement of a box or table of summary statistics in the graph (INSET statement). Quantile-quantile plots (Q-Q plots) and probability-probability plots (P-P plots). Grouping data or creating comparative plots with the CLASS statement. Rounding values of a variable (ROUND= option). Goodness-of-fit tests for a variety of distributions, including the normal. http://support.sas.com/documentation/cdl/en/procstat/63963/PDF/default/procstat.pdf

2. The BOXPLOT Procedure (SAS/STAT). A box-and-whisker plot, also called a box plot, displays the mean, quartiles, and minimum and maximum observations for a group. The length of the box represents the interquartile range (the distance between the 25th and 75th percentiles), the dot inside the box represents the mean, the horizontal line inside the box represents the median, and the vertical lines extending from the box (the whiskers) reach the minimum and maximum values of the analysis variable. With BOXSTYLE=SKELETAL, the whiskers are drawn from the edges of the box to the extreme values of the group. With BOXSTYLE=SCHEMATIC, a whisker is drawn from the upper edge of the box to the largest observed value within the upper fence, and from the lower edge of the box to the smallest observed value within the lower fence. http://support.sas.com/documentation/onlinedoc/stat/930/boxplot.pdf
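
The schematic whisker rule can be illustrated outside SAS. The following is a minimal Python sketch, not course material: it assumes the usual convention that the fences lie 1.5 interquartile ranges beyond the box edges, and it uses Python's "inclusive" quartile definition, which may differ from the percentile definition SAS applies. The function name schematic_box and the sample ages are hypothetical.

```python
import statistics


def schematic_box(values):
    """Summarize one group the way a schematic box plot would draw it.

    Assumption: fences sit 1.5 * IQR beyond the quartiles; the quartile
    definition is Python's "inclusive" method, not necessarily SAS's.
    """
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations still inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    return {
        "mean": statistics.mean(data),
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outside": [x for x in data if x < lower_fence or x > upper_fence],
    }


# Hypothetical sample: ages in one group, with one extreme value.
ages = [21, 23, 24, 25, 26, 27, 29, 30, 52]
box = schematic_box(ages)
print(box)
```

In this sample the value 52 falls beyond the upper fence, so the upper whisker stops at 30; a skeletal box plot would instead extend the whisker to 52.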

3. The FREQ Procedure (Base SAS Statistical Procedures). Creating one-way and n-way frequency and contingency (crosstabulation) tables. Goodness-of-fit tests for equal proportions or specified null proportions, and confidence limits. Testing for association in a crosstabulation table. TABLES statement (specifies the type of table; for 2x2 tables, odds ratio and relative risks). TEST (chi-square test), Pearson correlation coefficient, Spearman rank correlation coefficient. WEIGHT statement. http://support.sas.com/documentation/cdl/en/procstat/63963/PDF/default/procstat.pdf
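
For orientation, the 2x2 measures named in this topic can be computed by hand. The Python sketch below is an illustration under stated assumptions, not the FREQ procedure itself: the function name two_by_two, the cell layout (rows are groups, columns are event / no event), and the counts are all hypothetical, and the chi-square shown is the uncorrected Pearson statistic.

```python
def two_by_two(a, b, c, d):
    """Measures of association for a 2x2 table.

    Assumed layout (hypothetical):
                 event   no event
        group 1    a        b
        group 2    c        d
    """
    n = a + b + c + d
    odds_ratio = (a * d) / (b * c)
    risk_group1 = a / (a + b)
    risk_group2 = c / (c + d)
    relative_risk = risk_group1 / risk_group2
    # Pearson chi-square for a 2x2 table, without continuity correction.
    chi_square = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    return {
        "odds_ratio": odds_ratio,
        "relative_risk": relative_risk,
        "chi_square": chi_square,
    }


# Hypothetical counts: 20 of 100 events in group 1, 10 of 100 in group 2.
result = two_by_two(20, 80, 10, 90)
print(result)
```

The odds ratio compares odds (20/80 against 10/90), while the relative risk compares proportions (20/100 against 10/100), which is why the two measures differ for the same table.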

4. The ANOVA Procedure (SAS/STAT). The analysis of variance (ANOVA) for balanced data. The goal is to test for differences among the means of the levels and to quantify these differences. The classification variable is specified in the CLASS statement. Response variables must be numeric. Unlike in the GLM procedure, continuous variables on the right-hand side of the model are not allowed. Tukey’s multiple comparison tests for each level of the main effects can be produced. The GLM procedure is used for unbalanced data.

http://support.sas.com/documentation/onlinedoc/stat/930/anova.pdf http://support.sas.com/documentation/onlinedoc/stat/930/glm.pdf

5. The CORR Procedure (Base SAS Statistical Procedures). Pearson product-moment correlation (a parametric measure of association for two variables that measures the strength and direction of a linear relationship), Spearman rank-order correlation (a nonparametric measure of association based on ranks), Kendall’s tau-b coefficient (a measure of association based on the number of concordances and discordances in paired observations). Pearson, Spearman, and Kendall partial correlation (PARTIAL statement; a partial correlation measures the strength of a relationship between two variables while controlling for the effect of other variables). FREQ and WEIGHT statements are available.

http://support.sas.com/documentation/cdl/en/procstat/63963/PDF/default/procstat.pdf
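
The difference between the Pearson and Spearman coefficients in topic 5 can be seen in a small hand computation. This Python sketch is only an illustration and does not reproduce the CORR procedure: the function names pearson, average_ranks, and spearman and the data are hypothetical, and ties are assumed to receive average ranks.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: y grows monotonically with x, but not linearly.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

Because the relationship is monotone, the rank-based coefficient is 1, while the Pearson coefficient stays below 1 since the relationship is not linear.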

6. The REG Procedure (SAS/STAT). A general-purpose procedure for regression. It performs linear regression with many diagnostic capabilities, selects models by using one of nine methods, produces scatter plots of raw data and statistics, highlights scatter plots to identify particular observations, and allows interactive changes in both the regression model and the data that are used to fit the model. Simple linear regression and polynomial regression.

http://support.sas.com/documentation/cdl/en/statug/63962/HTML/default/viewer.htm#statug_reg_sect001.htm

7. The FACTOR Procedure (SAS/STAT) performs a variety of common factor and component analyses and rotations. The purpose of common factor analysis is to explain the correlations or covariances among a set of variables in terms of a limited number of unobservable, latent variables. Factor extraction with principal component analysis, factor rotation, factor loadings, factor scores. FREQ and WEIGHT statements. http://support.sas.com/documentation/onlinedoc/stat/930/factor.pdf

8. Standardization Procedures: STANDARD, STDIZE.

• The STANDARD Procedure (Base SAS). The procedure standardizes variables in a SAS data set to a given mean and standard deviation, and it creates a new SAS data set containing the standardized values.

• The STDIZE Procedure (SAS/STAT). The STDIZE procedure standardizes one or more numeric variables in a SAS data set by subtracting a location measure and dividing by a scale measure. A variety of location and scale measures are provided. http://support.sas.com/documentation/onlinedoc/stat/930/stdize.pdf
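
The "subtract a location measure, divide by a scale measure" idea can be sketched in a few lines of Python. This is an illustration only, with hypothetical function names (standardize, z_scores, range_scaled) and data; it shows two common location and scale pairs, mean with standard deviation and minimum with range, and makes no claim about how the SAS procedures are implemented.

```python
import statistics


def standardize(values, location, scale):
    """Subtract a location measure and divide by a scale measure."""
    if scale == 0:
        raise ValueError("scale measure must be nonzero")
    return [(v - location) / scale for v in values]


def z_scores(values):
    """Location = mean, scale = sample standard deviation."""
    return standardize(values, statistics.mean(values), statistics.stdev(values))


def range_scaled(values):
    """Location = minimum, scale = range; results fall in [0, 1]."""
    low, high = min(values), max(values)
    return standardize(values, low, high - low)


# Hypothetical variable to standardize.
raw = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(raw)
scaled = range_scaled(raw)
print(z)
print(scaled)
```

After z-scoring, the variable has mean 0 and standard deviation 1, which puts variables measured in different units on a comparable footing before distance-based methods such as cluster analysis.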

9. Macro Facility. Macro variables, macro functions, macro statements, macro programs. Defining and invoking a macro variable. Macros with parameters: positional and keyword macro arguments. Defining and invoking a macro program. Including external macros, autocall macro libraries.

10. The DISTANCE Procedure (SAS/STAT) computes various measures of distance, dissimilarity, or similarity between the observations (rows) of an input data set, which can contain numeric or character variables, or both, depending on which proximity measure is used. Various nonparametric and parametric methods can be used for standardizing variables. http://support.sas.com/documentation/onlinedoc/stat/930/distance.pdf

11. The CLUSTER Procedure (SAS/STAT). The purpose of cluster analysis is to place objects into groups or clusters suggested by the data, not defined a priori, such that objects in a given cluster tend to be similar to each other in some sense, and objects in different clusters tend to be dissimilar. The CLUSTER procedure performs hierarchical clustering of observations. The data can be coordinates or distances. Scaling or transforming variables. Computing Euclidean distances, or using the DISTANCE procedure to produce a distance matrix. Eleven clustering methods. Creating an output data set (OUTTREE=) in order to draw a tree diagram with the TREE procedure. http://support.sas.com/documentation/onlinedoc/stat/930/cluster.pdf

12. The TREE Procedure (SAS/STAT). The TREE procedure reads a data set created by the CLUSTER or VARCLUS procedure and produces a tree diagram (dendrogram or phenogram). Horizontal or vertical tree diagrams. ID statement (identifies objects). http://support.sas.com/documentation/onlinedoc/stat/930/tree.pdf

13. The VARCLUS Procedure (SAS/STAT) divides a set of numeric variables into disjoint or hierarchical clusters. Associated with each cluster is a linear combination of the variables in the cluster. This linear combination can be either the first principal component (the default) or the centroid component. The first principal component is a weighted average of the variables that explains as much variance as possible. Centroid components are unweighted averages of either the standardized variables (the default) or the raw variables. A dendrogram of variable clusters is displayed.

http://support.sas.com/documentation/onlinedoc/stat/930/varclus.pdf

 
Charles University | Information system of Charles University | http://www.cuni.cz/UKEN-329.html