SAS is the most popular Data Analytics tool in the market. This blog is the perfect guide for you to learn all the concepts required to clear a SAS interview. We have segregated the questions based on the difficulty levels and this will help people with different expertise levels to reap the maximum benefit from our blog. SAS Interview Questions blog will be a one-stop resource from where you can boost your interview preparation.
Before moving to SAS interview questions, let us understand why SAS is important. SAS is easy to learn and provides an easy option (PROC SQL) for people who already know SQL. SAS is on par with all leading tools, including R and Python, when it comes to handling huge amounts of data and options for parallel computation. Globally, SAS is the market leader in available corporate jobs. In India, SAS controls about 70% of the data analytics market share compared to 15% for R. If you are planning to step into Data Analytics, now is the right time to start with SAS Certification Training. Now, let us move on to some of the most important SAS interview questions that can be asked in your SAS interview.
Answer: We will compare SAS with the popular alternatives in the market based on the following aspects:
| Aspect | Description |
|---|---|
| Ease of Learning | SAS is easy to learn and provides an easy option (PROC SQL) for people who already know SQL. |
| Data Handling Capabilities | SAS is on par with all leading tools, including R and Python, when it comes to handling huge amounts of data and options for parallel computation. |
| Graphical Capabilities | SAS provides functional graphical capabilities, and with a little bit of learning it is possible to customize these plots. |
| Advancements in Tool | SAS releases updates in a controlled environment, hence they are well tested. R and Python, on the other hand, have open contribution, and there are chances of errors in the latest developments. |
| Job Scenario | Globally, SAS is the market leader in available corporate jobs. In India, SAS controls about 70% of the data analytics market share. |
Answer: SAS stands for Statistical Analysis System.
Answer: The following are the features of SAS:
Figure: SAS Interview Questions – Features of SAS
Answer: The following are the four capabilities in SAS Framework:
Figure: SAS Interview Questions – SAS Framework
Answer: You can use the OUTPUT statement to save summary statistics in a SAS data set. This information can then be used to create customized reports or to save historical information about a process.
You can use options in the OUTPUT statement to
Answer: The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing with the statements after the end of the current DATA step.
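As a minimal sketch of this behavior, the step below stops as soon as it reaches the sixth observation. It assumes the SASHELP.CLASS sample data set is available in your installation; the output data set name first_five is only illustrative.

```sas
/* Assumes the SASHELP.CLASS sample data set is installed */
data first_five;
  set sashelp.class;
  if _n_ > 5 then stop;   /* leave the DATA step before observation 6 is written */
run;

/* Processing resumes here, after the end of the stopped DATA step */
proc print data=first_five;
run;
```

Because STOP fires before the implicit output of the sixth iteration, first_five should contain the first five observations only.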
Answer: If you do not want to process certain variables and you do not want them to appear in the new data set, specify the DROP= data set option in the SET statement.
If you do want to process certain variables but do not want them to appear in the new data set, specify the DROP= data set option in the DATA statement.
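The difference between the two placements can be sketched as follows. The data set and variable names (scores, test1, test2) are made up for illustration.

```sas
/* Hypothetical input data set */
data scores;
  input name $ test1 test2;
  datalines;
Ann 80 90
Bob 70 60
;
run;

/* DROP= on SET: test2 is never read, so it cannot be used in this step */
data no_test2;
  set scores(drop=test2);
  doubled = test1 * 2;
run;

/* DROP= on DATA: test1 and test2 can be processed, then are left out of the output */
data totals(drop=test1 test2);
  set scores;
  total = test1 + test2;
run;
```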
Answer: We can read the last observation into a new data set by using the END= option in the SET statement.
Here, calculus is the new data set to be created and comp is the existing data set. last is a temporary variable (initialized to 0) that is set to 1 when the SET statement reads the last observation.
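The step described above would look roughly like the sketch below. The names calculus, comp and last come from the explanation; the contents of comp are invented here so the example can run on its own.

```sas
/* Hypothetical contents for the existing data set comp */
data comp;
  input id score;
  datalines;
1 10
2 20
3 30
;
run;

data calculus;
  set comp end=last;   /* last is 1 only while the final observation is being read */
  if last;             /* subsetting IF keeps just that observation */
run;
```

Since last is a temporary variable, it is not written to calculus.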
Answer: The main difference is that when reading an existing data set with the SET statement, SAS retains the values of the variables from one observation to the next. When reading data from an external file, only the observations are read, and the variables have to be re-declared if they need to be used.
Answer: There are two data types in SAS: character and numeric. Dates are stored as numeric values (the number of days since January 1, 1960), and SAS provides functions and formats to work with them.
Answer: Functions expect argument values to be supplied across an observation in a SAS data set whereas a procedure expects one variable value per observation.
data average;
  set temp;
  avgtemp = mean(of T1-T24);  /* average of T1 through T24 within one observation */
run;
Here arguments of mean function are taken across an observation. The mean function calculates the average of the different values in a single observation.
proc sort;
  by month;
run;
proc means;
  by month;
  var avgtemp;
run;
PROC MEANS is used to calculate the average temperature by month, taking one variable value per observation. Here, the procedure computes the mean of avgtemp within each value of month.
Answer: SUM function returns the sum of non-missing arguments whereas “+” operator returns a missing value if any of the arguments are missing.
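data mydata;  /* data set names are illustrative */
  input x y z;
  datalines;
33 3 3
24 3 4
24 3 4
. 3 2
23 . 3
54 4 .
35 4 2
;
run;
data mydata2;
  set mydata;
  a = sum(x, y, z);  /* ignores missing arguments */
  p = x + y + z;     /* missing if any operand is missing */
run;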
In the output, the value of p is missing for the 4th, 5th and 6th observations:
| a | p |
|---|---|
| 39 | 39 |
| 31 | 31 |
| 31 | 31 |
| 5 | . |
| 26 | . |
| 58 | . |
| 41 | 41 |
Answer: PROC MEANS produces subgroup statistics only when a BY statement is used and the input data has been previously sorted (using PROC SORT) by the BY variables.
PROC SUMMARY automatically produces statistics for all subgroups, giving you all the information in one run that you would otherwise get by repeatedly sorting a data set by the variables that define each subgroup and running PROC MEANS. PROC SUMMARY does not produce any printed output by default, so you need to use the OUTPUT statement to create a new data set and then use PROC PRINT to see the computed statistics.
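A minimal sketch of that workflow is shown below, assuming the SASHELP.CLASS sample data set is available. The output names height_stats and avg_height are illustrative.

```sas
/* Assumes the SASHELP.CLASS sample data set is installed */
proc summary data=sashelp.class;
  class sex;                                  /* subgroups, no prior sort needed */
  var height;
  output out=height_stats mean=avg_height;    /* statistics go to a data set */
run;

/* PROC SUMMARY printed nothing, so display the saved statistics */
proc print data=height_stats;
run;
```

The output data set also carries the automatic variables _TYPE_ and _FREQ_, which identify the subgroup level and the number of observations behind each row.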
Answer: Suppose the value of a variable PayRate begins with a dollar sign ($). When SAS tries to automatically convert the values of PayRate to numeric values, the dollar sign blocks the process, and the values cannot be converted.
Therefore, it is always best to include INPUT and PUT functions in your programs when conversions occur.
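An explicit conversion for a value like the one described can be sketched as follows. The sample value and the variable names PayNum and PayChar are assumptions; COMMAw.d is used as the informat because it removes embedded dollar signs and commas.

```sas
data pay;
  PayRate = '$1,250.50';                 /* hypothetical character value */
  PayNum  = input(PayRate, comma10.);    /* character to numeric, stripping $ and commas */
  PayChar = put(PayNum, dollar10.2);     /* numeric back to character */
  put PayNum= PayChar=;                  /* write both values to the log */
run;
```

Doing the conversion explicitly also avoids the automatic-conversion notes that SAS would otherwise write to the log.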
Answer: There are three ways to delete duplicate observations in a dataset:
1. By using the NODUPS option in PROC SORT
proc sort data=SAS-Dataset nodups; by var; run;
2. By using an SQL query inside a procedure
proc sql; create table SAS-Dataset as select distinct * from Old-SAS-Dataset; quit;
3. By cleaning the data
If first.group and last.group then
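The three approaches can be sketched on a small example. Everything below is illustrative: the visits data set is invented, NODUPKEY is used here in place of NODUPS, and the DATA step keeps the first observation of each BY group, which is one common reading of the third approach rather than the exact code the answer had in mind.

```sas
/* Hypothetical input: id 2 appears twice with identical values */
data visits;
  input id group $;
  datalines;
1 A
2 B
2 B
3 C
;
run;

/* 1. PROC SORT: NODUPKEY keeps the first observation for each BY value */
proc sort data=visits out=dedup_sort nodupkey;
  by id group;
run;

/* 2. PROC SQL: SELECT DISTINCT drops rows identical in every column */
proc sql;
  create table dedup_sql as
  select distinct * from visits;
quit;

/* 3. DATA step: sort, then keep the first observation in each BY group */
proc sort data=visits out=visits_sorted;
  by id;
run;

data dedup_step;
  set visits_sorted;
  by id;
  if first.id;
run;
```

Each of the three output data sets should end up with one row per id.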
Answer: PROC SQL is a simultaneous process for all the observations. The following steps happen when PROC SQL is executed:
Answer: INPUT function: character-to-numeric conversion, INPUT(source, informat)
PUT function: numeric-to-character conversion, PUT(source, format)
Answer: Here, we will calculate the weeks between 31st December 2000 and 1st January 2001. 31st December 2000 was a Sunday, so 1st January 2001 is a Monday in the same week. Hence, Weeks = 0.
Years = 1, since the two days are in different calendar years.
Months = 1, since the two days are in different calendar months.
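Assuming the question refers to the INTCK function, which counts the interval boundaries crossed between two dates, the three results can be checked with a short step like this:

```sas
data _null_;
  weeks  = intck('week',  '31dec2000'd, '01jan2001'd);   /* no Sunday boundary crossed */
  months = intck('month', '31dec2000'd, '01jan2001'd);   /* one month boundary crossed */
  years  = intck('year',  '31dec2000'd, '01jan2001'd);   /* one year boundary crossed */
  put weeks= months= years=;   /* writes weeks=0 months=1 years=1 to the log */
run;
```

INTCK counts boundaries rather than elapsed time, which is why one day apart can still give a result of 1 for months and years.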
Answer: a=Road; b=NY
Answer: SCAN, SUBSTR, TRIM, CATX, INDEX, TRANWRD, FIND, SUM.
Answer: TRANWRD function replaces or removes all occurrences of a pattern of characters within a character string.
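A small illustration of TRANWRD, using an invented string:

```sas
data _null_;
  text  = 'Mrs. Smith and Mrs. Jones';      /* hypothetical input */
  fixed = tranwrd(text, 'Mrs.', 'Ms.');     /* replace every occurrence of 'Mrs.' */
  put fixed=;                               /* writes fixed=Ms. Smith and Ms. Jones */
run;
```

The arguments are the source string, the pattern to find, and the replacement, in that order.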
do month=1 to 12;
Answer: Value of month would be 13
No. of observations would be 1
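The kind of step this answer describes can be sketched as below. Only the `do month=1 to 12;` line comes from the original question; the data set name, starting amount and rate are assumptions made so the example runs.

```sas
data earnings;
  amount = 1000;            /* hypothetical starting value */
  rate   = 0.075 / 12;      /* hypothetical monthly rate */
  do month = 1 to 12;
    earned + (amount + earned) * rate;   /* sum statement accumulates across iterations */
  end;
  put month=;               /* writes month=13 to the log */
run;
```

The index is incremented to 13 before the loop condition fails, and because there is no OUTPUT statement inside the loop, the implicit output at the end of the step writes a single observation.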
Data is central to every data set. In SAS, data is available in tabular form where variables occupy the column space and observations occupy the row space.
Figure: SAS Interview Questions – SAS Dates
do month=1 to 12;
Answer: We can use ‘do until’ or ‘do while’ to specify the condition.
Answer: An important difference between the DO UNTIL and DO WHILE statements is that the DO WHILE expression is evaluated at the top of the DO loop. If the expression is false the first time it is evaluated, the DO loop never executes. By contrast, the DO UNTIL expression is evaluated at the bottom of the loop, so a DO UNTIL loop always executes at least once.
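The difference is easiest to see when the condition is already decided before the loop starts. In this sketch, the variable n and the messages are illustrative:

```sas
data _null_;
  n = 10;
  do while (n < 5);        /* false at the top, so the body never runs */
    put 'DO WHILE body ran';
  end;
  do until (n >= 5);       /* checked at the bottom, so the body runs once */
    put 'DO UNTIL body ran';
  end;
run;
```

Only the DO UNTIL message should appear in the log.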
do i=1 to 20 until(Sum>=20000);
This iterative DO statement enables you to execute the DO loop until Sum is greater than or equal to 20000 or until the DO loop executes 20 times, whichever occurs first.
Answer: The SCAN function is used as scan(argument, n, delimiters).
Here, argument specifies the character variable or expression to scan,
n specifies which word to read, and
delimiters are special characters that must be enclosed in single quotation marks.
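For example, the sketch below pulls the first and second comma-separated words out of an invented address string:

```sas
data _null_;
  address = 'Ruby Road,NY,10001';    /* hypothetical value */
  street  = scan(address, 1, ',');   /* first word when splitting on commas */
  state   = scan(address, 2, ',');   /* second word */
  put street= state=;                /* writes street=Ruby Road state=NY */
run;
```

Because only the comma is listed as a delimiter, the blank inside "Ruby Road" does not split the word.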
Answer: Yes, it depends on how you use the variable. There are some numbers we will want to use as a categorical value rather than a quantity. An example is a variable called “Foreigner” where the observations have the value “0” or “1”, representing not a foreigner and foreigner respectively. Similarly, the ID of a particular table can be a number but does not represent any quantity. Phone numbers are another popular example.
Answer: No, it must be character data type.
Answer: The number of observations is limited only by computer’s capacity to handle and store them.
Prior to SAS 9.1, SAS data sets could contain up to 32,767 variables. In SAS 9.1, the maximum number of variables in a SAS data set is limited by the resources available on your computer.
Answer: The trailing @ is a line-hold specifier. Using the trailing @ in the INPUT statement gives you the ability to read a part of your raw data line, test it, and then decide how to read additional data from the same record.
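A minimal sketch of this read-test-read pattern follows. The record layout (a leading type code, then a name and an age) is invented for illustration.

```sas
data adults;
  input type $ @;        /* read the type code and hold the line */
  if type = 'A';         /* test it: other record types are skipped */
  input name $ age;      /* continue reading from the same held record */
  datalines;
A Alice 34
C Bob 12
A Carol 41
;
run;
```

The held line is released automatically when the next iteration of the DATA step begins, so the record for Bob is simply passed over.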
Answer: All of the variables in a summary report must be defined as group, analysis, across or computed variables.
Answer: n-count, mean, standard deviation, minimum, and maximum
Answer: By using MAXDEC= option
indexed in the order of the BY variables.
Answer: The difference between the two procedures is that PROC MEANS produces a report by default. By contrast, to produce a report in PROC SUMMARY, you must include a PRINT option in the PROC SUMMARY statement.
Answer: By using TABLES Statement.
Answer: Adding the CROSSLIST option to TABLES statement displays crosstabulation tables in ODS column format.
Answer: To generate list output for crosstabulations, add a slash (/) and the LIST option to the TABLES statement in your PROC FREQ step.
TABLES variable-1*variable-2 <* … variable-n> / LIST;
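Both table layouts mentioned above can be requested in one PROC FREQ step. This sketch assumes the SASHELP.CLASS sample data set is available:

```sas
/* Assumes the SASHELP.CLASS sample data set is installed */
proc freq data=sashelp.class;
  tables sex*age / list;        /* one line per combination of sex and age */
  tables sex*age / crosslist;   /* crosstabulation shown in ODS column format */
run;
```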
Answer: We will use PROC MEANS for numeric variables whereas we use PROC FREQ for categorical variables.
Answer: Merging combines observations from two or more SAS data sets into a single observation in a new data set.
A one-to-one merge, shown in the following figure, combines observations based on their position in the data sets. You use the MERGE statement for one-to-one merging.
We can merge the data sets in a one-to-one fashion with the code below.
data combined; merge data1 data2; run;
set a b;
The format is DOLLAR with a total width of 10 characters, including two digits after the decimal point.
Answer: Interleaving combines individual, sorted SAS data sets into one sorted SAS data set. For each observation, the following figure shows the value of the variable by which the data sets are sorted. You interleave data sets using a SET statement along with a BY statement.
In the following example, the data sets are sorted by the variable Year.
We can sort and then join the datasets on Year with the below code.
data combined; set data1 data2; by Year; run;
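A self-contained version of the interleave might look like the sketch below. The names data1, data2, combined and Year follow the text; the Sales variable and all values are made up.

```sas
/* Hypothetical inputs, deliberately unsorted */
data data1;
  input Year Sales;
  datalines;
2003 30
2001 10
;
run;

data data2;
  input Year Sales;
  datalines;
2004 40
2002 20
;
run;

/* Each input must be sorted by the BY variable before interleaving */
proc sort data=data1; by Year; run;
proc sort data=data2; by Year; run;

/* SET with BY interleaves the observations in Year order */
data combined;
  set data1 data2;
  by Year;
run;
```

Without the BY statement, the same SET statement would concatenate the data sets instead, placing all of data1 before data2.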
Answer: We will use the following code to rename a and b to e and f:
data concat(rename=(a=e b=f));
Answer: If both data sets in the MERGE statement are sorted by id (as shown below) and each observation in one data set has a corresponding observation in the other data set, a one-to-one merge is suitable.
input id class $;
input id class1 $;
merge mydata1 mydata2;
If the observations do not match, then match-merging is suitable.
input id class $;
input id class1 $;
merge mydata1 mydata2;
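A complete match-merge could be sketched as follows. The data set and variable names follow the fragments above; the data values and the output name mymerge are invented, and the BY statement is what turns the merge into a match-merge.

```sas
/* Hypothetical data: id 2 appears only in mydata1, id 3 only in mydata2 */
data mydata1;
  input id class $;
  datalines;
1 Sa
2 Ra
4 Ka
;
run;

data mydata2;
  input id class1 $;
  datalines;
1 Pr
3 Ta
4 Ma
;
run;

/* Both inputs are already sorted by id, as match-merging requires */
data mymerge;
  merge mydata1 mydata2;
  by id;
run;
```

In the result, ids without a partner in the other data set get missing values for that data set's variables, rather than being paired by position with an unrelated row.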
I hope this set of SAS interview questions will help you in preparing for your interview. Further, I would recommend SAS Tutorial videos from Edureka to learn more.