R is a free programming language for analyzing data, doing statistics, and creating charts. It was created in the 1990s by Ross Ihaka and Robert Gentleman. R is popular with data scientists and researchers because it helps solve complicated data problems. Unlike general-purpose languages, R is built for working with data, which makes it well suited to tasks like analyzing and visualizing information. R has many tools that can be used in areas like machine learning, finance, and biology. This tutorial teaches you the basics of R, including how to set it up and work with variables, data types, and functions, so you can get started with R programming.
What is R Programming Language?
R is a free programming language used for analyzing data, performing statistics, and creating graphs. It was developed in the 1990s by Ross Ihaka and Robert Gentleman and is now very popular among researchers and data scientists. R has many tools and libraries for working with data. Unlike general-purpose languages such as Python or Java, R is specifically designed for organizing, analyzing, and visualizing data, which makes it well suited to complex data problems.
History of R Programming
R was created in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It is a free and open-source tool for statistical computing and graphics. R was inspired by the S programming language and Scheme. The goal was to make data analysis and visualization easier for researchers and statisticians. Over time, R became popular because it is flexible, can be extended with packages, and has a strong community of users. In 1997, a core team was formed to manage its development. Today, R is widely used in data science to solve many types of data problems in different fields.
What is CRAN?
CRAN (Comprehensive R Archive Network) is the main repository for R itself and for R packages. It is the primary place to download tools for data analysis and statistics. Packages submitted to CRAN go through automated checks so that they install and run with current versions of R. CRAN also hosts manuals and guides to help users. With thousands of packages, CRAN lets users add new features to R and improve their work.
Why Learn R Programming?
The demand for skilled professionals who can effectively manipulate and analyze data is growing rapidly. Learning R can open doors to career opportunities in many fields. Here are some reasons to learn it:
- Data Analysis and Statistics: R is made for analyzing data and performing statistical tasks like testing and modeling. You can also write your own functions for specific needs.
- Beautiful Visuals: With R, you can make clear and creative charts with ggplot2 and interactive dashboards with Shiny.
- Used Everywhere: From finance to biology, R is used in many fields to study data and make predictions.
- Free and Supported: R is free to use, has many extra features (packages), and has a helpful community to answer questions.
- Works Well with Others: It can connect with tools like Excel or Python and makes reports easy to share using R Markdown.
- Job Opportunities: R is highly valued in jobs like data science, research, and analytics.
- Easy to Learn: It is beginner-friendly, with simple syntax and many guides to help you start.
In short, R is a strong choice if you want to analyze data, make polished visuals, and explore new career opportunities.
R Programming Tutorial: The Basics
Let’s dive into the basics of R programming to give you a solid foundation. Here are the key concepts and steps you need to understand when learning R programming.
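1. Variables
A. Assigning Values
Variables are created with the assignment operator <- (the = operator also works for top-level assignment).
x <- 10      # Assign 10 to x
y = 20       # Assign 20 to y (valid but less common)
z <- x + y   # Assign the sum of x and y to z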
B. Variable Names
- Must start with a letter or a dot (but not followed by a number if it starts with a dot).
- Can contain letters, numbers, underscores (_), and dots (.).
- Cannot contain spaces or special characters.
- Are case-sensitive (Var and var are different).
Examples of valid names in R programming language:
var1 <- 5
.my_var <- 10
this.is.valid <- TRUE
Examples of invalid names:
1var <- 5      # Invalid: starts with a number
my-var <- 10   # Invalid: contains a hyphen
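The naming rules above can also be checked programmatically. The sketch below compares a candidate name with the output of base R's make.names(), which rewrites invalid names into valid ones. The helper is_valid_name is a hypothetical name introduced here for illustration, not a built-in function.

```r
# Hypothetical helper: a name is syntactically valid if make.names() leaves it unchanged
is_valid_name <- function(name) {
  name == make.names(name)
}

is_valid_name("var1")           # TRUE
is_valid_name(".my_var")        # TRUE
is_valid_name("this.is.valid")  # TRUE
is_valid_name("1var")           # FALSE (starts with a number)
is_valid_name("my-var")         # FALSE (contains a hyphen)
```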
C. Checking Variable Type
You can use functions like class(), typeof(), or str() to check the type of a variable.
x <- 5
class(x)    # "numeric"
typeof(x)   # "double"
D. Assigning Different Data Types
R supports various data types, and a variable can store any of them:
- Numeric: Real numbers (e.g., 3.14, 42).
- Integer: Whole numbers, explicitly declared with L (e.g., 10L).
- Character: Strings (e.g., "hello").
- Logical: Boolean values (TRUE, FALSE).
- Complex: Complex numbers (e.g., 2+3i).
- Raw: Raw bytes.
Examples:
a <- 42        # Numeric
b <- 10L       # Integer
c <- "Hello"   # Character
d <- TRUE      # Logical
e <- 2 + 3i    # Complex
E. Viewing Variables in Memory
You can use the following functions to inspect and manage the variables in your environment:
- ls(): Lists all variable names in the current environment.
- rm(): Removes variables from the environment.
ls()              # List all variables
rm(a)             # Remove variable `a`
rm(list = ls())   # Clear all variables
F. Scope of Variables
- Global variables: Declared in the main script and available throughout the program.
- Local variables: Declared inside a function and accessible only within that function.
Example:
x <- 10   # Global variable

my_function <- function() {
  y <- 20         # Local variable
  return(x + y)   # Accessing global and local variables
}

my_function()   # Returns 30
y               # Error: object 'y' not found
2. Data Types
In R, data types define the kind of values that variables can hold. R supports several data types, each suited to specific purposes. Below is an overview:
01. Atomic Data Types
Atomic data types are the most basic types in R and include:
a. Numeric
- Represents real numbers, with or without a decimal part (e.g., 42, 3.14).
- Default type for numbers without explicit declaration.
- Stored as double precision by default.
Examples:
x <- 42     # Numeric
class(x)    # "numeric"
typeof(x)   # "double"
b. Integer
- Whole numbers.
- Declared using L after the number.
Examples:
y <- 42L    # Integer
class(y)    # "integer"
typeof(y)   # "integer"
c. Character
- Represents text strings.
Examples:
z <- "Hello, R!"
class(z)   # "character"
d. Logical
- Represents Boolean values: TRUE or FALSE.
Examples:
is_true <- TRUE
is_false <- FALSE
class(is_true)   # "logical"
e. Complex
- Represents complex numbers.
Examples:
cplx <- 3 + 2i   # Complex number (named cplx to avoid masking the c() function)
class(cplx)      # "complex"
f. Raw
- Represents raw bytes.
Examples:
r <- charToRaw("R programming")
class(r)   # "raw"
02. Data Structures (Composite Data Types)
a. Vector
- A one-dimensional collection of elements of the same type.
- Created using the c() function.
Examples:
v <- c(1, 2, 3, 4)            # Numeric vector
char_v <- c("a", "b")         # Character vector
logical_v <- c(TRUE, FALSE)   # Logical vector
b. Matrix
- A two-dimensional collection of elements of the same type.
- Created using the matrix() function.
Examples:
m <- matrix(1:9, nrow = 3, ncol = 3)
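As a short sketch of how such a matrix can be inspected and indexed with base R: note that matrix() fills values column by column unless byrow = TRUE is set. The name m_by_row is only illustrative.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)  # Filled column by column

dim(m)    # 3 3 (rows, columns)
m[2, 3]   # Element in row 2, column 3: 8
m[1, ]    # First row: 1 4 7
m[, 2]    # Second column: 4 5 6

# Fill row by row instead
m_by_row <- matrix(1:9, nrow = 3, byrow = TRUE)
m_by_row[1, ]   # 1 2 3
```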
c. List
- A collection of elements that can be of different types.
Examples:
l <- list(1, "a", TRUE)
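List elements are usually easier to work with when they are named. The following minimal sketch shows the common ways to access and extend a list. The list person and its fields are made up for illustration.

```r
# A named list mixing types (names are illustrative)
person <- list(name = "Alice", age = 25, passed = TRUE)

person$name       # "Alice" (access by name)
person[["age"]]   # 25 (double brackets return the element itself)
person["age"]     # A list of length 1 (single brackets return a sub-list)

# Add a new element
person$scores <- c(90, 85)
names(person)     # "name" "age" "passed" "scores"
```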
d. Data Frame
- A two-dimensional, table-like structure in which each column can have a different type.
- Created using the data.frame() function.
Examples:
df <- data.frame(
  ID = 1:3,
  Name = c("Alice", "Bob", "Charlie"),
  Score = c(90, 85, 88)
)
e. Array
- A multi-dimensional collection of elements of the same type.
Examples:
a <- array(1:12, dim = c(3, 2, 2))
03. Special Values
R includes special values that represent certain states:
- NA: Missing or undefined value.
- NaN: "Not a Number" (e.g., a result of 0/0).
- Inf and -Inf: Positive and negative infinity (e.g., result of 1/0 or -1/0).
Examples:
x <- NA      # Missing value
y <- NaN     # Not a Number
z <- 1 / 0   # Inf
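In practice, these values usually need to be detected or skipped before computing summaries. The sketch below uses base R's is.na(), is.nan(), is.finite(), and the na.rm argument of mean(). The vector values is an illustrative example.

```r
values <- c(4, NA, 10, NaN, Inf)

is.na(values)       # FALSE TRUE FALSE TRUE FALSE (NaN also counts as NA)
is.nan(values)      # FALSE FALSE FALSE TRUE FALSE
is.finite(values)   # TRUE FALSE TRUE FALSE FALSE

mean(c(4, NA, 10))                 # NA, because missing values propagate
mean(c(4, NA, 10), na.rm = TRUE)   # 7, after dropping the NA
```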
04. Factors
Factors represent categorical data and are useful for statistical modeling.
Examples:
f <- factor(c("Male", "Female", "Male", "Female"))
levels(f)   # "Female" "Male"
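By default, levels are sorted alphabetically. When the categories have a natural order, the levels argument can set it explicitly, as in this small sketch. The sizes data is made up for illustration.

```r
sizes <- factor(
  c("small", "large", "medium", "small"),
  levels = c("small", "medium", "large")
)

levels(sizes)       # "small" "medium" "large" (the order we specified)
table(sizes)        # Counts per level: small 2, medium 1, large 1
as.integer(sizes)   # 1 3 2 1 (the underlying integer codes)
```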
05. Type Conversion
You can convert between data types using functions like as.numeric(), as.character(), etc.
Examples:
x <- "42"
num_x <- as.numeric(x)   # Converts "42" to 42
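A few more conversions are worth trying, including what happens when a conversion is not possible. This sketch uses only base R functions.

```r
as.integer(3.9)      # 3 (truncates toward zero, does not round)
as.character(42)     # "42"
as.logical("TRUE")   # TRUE
as.numeric(TRUE)     # 1

# Text that is not a number becomes NA (R also emits a warning,
# suppressed here to keep the output clean)
suppressWarnings(as.numeric("abc"))   # NA
```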
3. Vectors
A vector is a collection of elements of the same type.
# Create a vector
vec <- c(1, 2, 3, 4, 5)
vec
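Most operations in R work on whole vectors at once. The following sketch shows element-wise arithmetic, indexing, and what happens when types are mixed.

```r
vec <- c(1, 2, 3, 4, 5)

vec * 2        # 2 4 6 8 10 (arithmetic applies to every element)
vec[2]         # 2 (indexing starts at 1)
vec[vec > 3]   # 4 5 (logical filtering)
length(vec)    # 5
sum(vec)       # 15

# Mixing types coerces everything to a common type
c(1, "a", TRUE)   # "1" "a" "TRUE"
```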
4. Sequences
# Generate sequences
seq(1, 10, by = 2)   # Sequence from 1 to 10 with step 2
1:10                 # Sequence from 1 to 10
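Base R has a few other sequence helpers that come up often. The sketch below is not exhaustive.

```r
seq(0, 1, length.out = 5)     # 0.00 0.25 0.50 0.75 1.00
seq_len(4)                    # 1 2 3 4
rev(1:5)                      # 5 4 3 2 1
rep(c("a", "b"), times = 2)   # "a" "b" "a" "b"

# seq_len() is safer than 1:n in loops, because 1:0 counts down to 0
length(1:0)          # 2
length(seq_len(0))   # 0
```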
Control Structures
Control structures let you control the flow of execution in your code. They run specific code blocks conditionally or repeatedly. Below are the main control structures in R:
a. if Statement
Executes a block of code if a condition is TRUE.
x <- 10
if (x > 5) {
  print("x is greater than 5")
}
b. if-else Statement
Adds an alternative block of code to execute when the condition is FALSE.
x <- 3
if (x > 5) {
  print("x is greater than 5")
} else {
  print("x is not greater than 5")
}
c. ifelse() Function
A vectorised version of if-else that returns a value for each element based on a condition.
x <- c(1, 2, 3, 4, 5)
result <- ifelse(x > 3, "Greater", "Smaller or Equal")
print(result)
# Output: "Smaller or Equal" "Smaller or Equal" "Smaller or Equal" "Greater" "Greater"
d. Nested if-else
An else if chain tests several conditions in order and runs the first branch that matches.
x <- 8
if (x > 10) {
  print("x is greater than 10")
} else if (x > 5) {
  print("x is greater than 5 but less than or equal to 10")
} else {
  print("x is 5 or less")
}
Looping Statements
a. for Loop
Iterates over a sequence or vector.
for (i in 1:5) {
  print(i)
}
Iterating over a character vector:
names <- c("Alice", "Bob", "Charlie")
for (name in names) {
  print(paste("Hello,", name))
}
b. while Loop
Repeats a block of code while a condition is TRUE.
x <- 1
while (x <= 5) {
  print(x)
  x <- x + 1
}
c. repeat Loop
Creates an infinite loop that must be exited explicitly with break.
x <- 1
repeat {
  print(x)
  x <- x + 1
  if (x > 5) {
    break
  }
}
Control Flow Commands
a. break
Exits the current loop prematurely.
for (i in 1:10) {
  if (i == 5) {
    break
  }
  print(i)
}
b. next
Skips the current iteration and proceeds to the next.
for (i in 1:5) {
  if (i == 3) {
    next
  }
  print(i)
}
Switch Statement
Evaluates an expression and selects the matching case. An unnamed final argument acts as the default.
x <- "b"
result <- switch(x,
  "a" = "First case",
  "b" = "Second case",
  "c" = "Third case",
  "Default case"
)
print(result)   # Output: "Second case"
Applying Control Structures to Functions
Control structures are often combined with functions for more complex logic. For example:
# Named my_factorial to avoid masking base R's factorial()
my_factorial <- function(n) {
  if (n == 0) {
    return(1)
  } else {
    result <- 1
    for (i in seq_len(n)) {
      result <- result * i
    }
    return(result)
  }
}

my_factorial(5)   # Output: 120
Functions
Functions are fundamental building blocks in R that let you organise, reuse, and modularise code. Here's a detailed overview:
1. Creating Functions
Functions in R are defined using the function keyword:
function_name <- function(arg1, arg2, ...) {
  # Function body
  return(value)   # Optional: the last evaluated value is returned if omitted
}
Example:
add_numbers <- function(a, b) {
  result <- a + b
  return(result)
}

add_numbers(3, 5)   # Output: 8
2. Function Arguments
a. Required Arguments
All arguments in the function definition must be provided when calling the function unless default values are specified.
# Both arguments are required because neither has a default value
multiply <- function(a, b) {
  return(a * b)
}

multiply(4, 5)   # Output: 20
b. Default Arguments
You can assign default values to arguments.
greet <- function(name = "User") {
  paste("Hello,", name)
}

greet()          # Output: "Hello, User"
greet("Alice")   # Output: "Hello, Alice"
c. Variable-Length Arguments (...)
The ... argument allows a function to accept a variable number of arguments.
sum_values <- function(...) {
  return(sum(...))
}

sum_values(1, 2, 3, 4, 5)   # Output: 15
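Inside a function, the extra arguments can be captured with list(...) and then inspected. A minimal sketch follows. describe_args is a hypothetical helper written for this example.

```r
# Hypothetical helper: reports how many arguments were passed and their names
describe_args <- function(...) {
  args <- list(...)   # Capture the extra arguments as a list
  list(count = length(args), names = names(args))
}

info <- describe_args(a = 1, b = "two", c = TRUE)
info$count   # 3
info$names   # "a" "b" "c"
```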
3. Return Values
Functions can return values explicitly using return() or implicitly by evaluating the last expression.
# Explicit return
square <- function(x) {
  return(x^2)
}

# Implicit return
cube <- function(x) {
  x^3
}

square(4)   # Output: 16
cube(3)     # Output: 27
4. Anonymous Functions
Anonymous functions are created without a name, often used as arguments to higher-order functions.
lapply(1:5, function(x) x^2)
# Output: a list of squared numbers: 1, 4, 9, 16, 25
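Anonymous functions work the same way with other higher-order functions in base R. The sketch below uses sapply() and Filter(). The variable names squares and evens are illustrative.

```r
# sapply() simplifies the result to a vector
squares <- sapply(1:5, function(x) x^2)
squares   # 1 4 9 16 25

# Filter() keeps the elements for which the function returns TRUE
evens <- Filter(function(x) x %% 2 == 0, 1:10)
evens     # 2 4 6 8 10
```

From R 4.1 onward, function(x) can also be written with the shorthand \(x). The longer form is used here so the example runs on older versions too.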
5. Nested Functions
Functions can be defined inside other functions. Inner functions are only accessible within the outer function.
outer_function <- function(x) {
  inner_function <- function(y) {
    y^2
  }
  inner_function(x) + x
}

outer_function(3)   # Output: 12 (3^2 + 3)
6. Scope of Variables
R uses lexical scoping to resolve variables.
- Local Scope: Variables defined inside a function are not accessible outside.
- Global Scope: Variables defined outside functions are accessible within functions unless a local variable with the same name shadows them.
x <- 10

scope_example <- function() {
  x <- 5   # Local variable
  return(x)
}

scope_example()   # Output: 5
x                 # Output: 10 (Global variable remains unchanged)
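The two remaining cases can be sketched as follows: a function reading a global variable that it does not shadow, and an inner function remembering variables from the function that created it (lexical scoping). All names here (threshold, is_above, make_counter) are hypothetical. The <<- operator, which assigns in an enclosing scope, is used only to make the effect visible.

```r
threshold <- 10   # Global variable

# No local `threshold` exists, so R looks it up in the enclosing (global) scope
is_above <- function(value) {
  value > threshold
}
is_above(12)   # TRUE

# Lexical scoping: the inner function remembers `step` and `count`
# from the call to make_counter() that created it
make_counter <- function(step) {
  count <- 0
  function() {
    count <<- count + step   # `<<-` updates `count` in the enclosing function
    count
  }
}

counter <- make_counter(2)
counter()   # 2
counter()   # 4
```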
7. Applying Functions
a. apply() Family of Functions
R provides concise alternatives to explicit loops with the apply() family:
- apply(): Apply a function to rows or columns of a matrix.
- lapply(): Apply a function to each element of a list.
- sapply(): Simplified version of lapply() that returns a vector or matrix.
- tapply(): Apply a function to subsets of a vector grouped by another vector.
- mapply(): Multivariate version of sapply().
Examples:
m <- matrix(1:9, nrow = 3)
apply(m, 1, sum)   # Row sums
apply(m, 2, sum)   # Column sums

lst <- list(a = 1:5, b = 6:10)
lapply(lst, sum)   # Sum of each list element
sapply(lst, sum)   # Output as a vector
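tapply() and mapply() are listed above but not shown, so here is a small sketch of each. The scores and groups vectors are made-up data.

```r
scores <- c(90, 85, 70, 95)
groups <- c("A", "B", "A", "B")

# tapply(): mean score within each group
group_means <- tapply(scores, groups, mean)
group_means   # A: 80, B: 90

# mapply(): apply a function to paired elements of two vectors
pair_sums <- mapply(function(x, y) x + y, 1:3, 4:6)
pair_sums     # 5 7 9
```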
8. Best Practices for Writing Functions
- Use meaningful names for functions and arguments.
- Include comments and documentation.
- Avoid hardcoding values; use arguments for flexibility.
- Test functions with different input scenarios.
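As one possible illustration of these practices, the sketch below defines a small, documented function with a descriptive name and an input check, then tries it on several inputs. celsius_to_fahrenheit is a hypothetical example, not a built-in function.

```r
# Convert temperatures from Celsius to Fahrenheit.
# celsius: numeric vector of temperatures in degrees Celsius
# Returns a numeric vector of the same length in degrees Fahrenheit.
celsius_to_fahrenheit <- function(celsius) {
  stopifnot(is.numeric(celsius))   # Fail early on invalid input
  celsius * 9 / 5 + 32
}

# Test with several input scenarios
celsius_to_fahrenheit(0)             # 32
celsius_to_fahrenheit(c(-40, 100))   # -40 212
```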
9. Example: Combining Functions
Here is an example that combines a function with control structures and applies it to each element of a vector with sapply():
calculate_grade <- function(score) {
  if (score >= 90) {
    return("A")
  } else if (score >= 80) {
    return("B")
  } else if (score >= 70) {
    return("C")
  } else if (score >= 60) {
    return("D")
  } else {
    return("F")
  }
}

scores <- c(95, 82, 67, 55)
grades <- sapply(scores, calculate_grade)
grades   # Output: "A" "B" "D" "F"
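The same grading thresholds can also be applied to the whole vector at once with base R's cut(), which bins numeric values into labelled intervals. This is an alternative sketch, not a replacement for the function above.

```r
scores <- c(95, 82, 67, 55)

# right = FALSE makes each interval include its lower bound, e.g. [90, Inf)
grades <- cut(
  scores,
  breaks = c(-Inf, 60, 70, 80, 90, Inf),
  labels = c("F", "D", "C", "B", "A"),
  right = FALSE
)
as.character(grades)   # "A" "B" "D" "F"
```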
Working with Data
1. Data Frames
Data frames are used to store tabular data.
# Create a data frame
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Score = c(90, 85, 88)
)

# Access columns
data$Name

# View the data frame
print(data)
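A few common operations on a data frame like this one, sketched with base R only. The Passed column and its cutoff of 88 are assumptions made for illustration.

```r
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Score = c(90, 85, 88)
)

nrow(data)                         # 3
mean(data$Score)                   # 87.66667
data[data$Age > 26, ]              # Rows for Bob and Charlie
data$Passed <- data$Score >= 88    # Add a logical column (illustrative cutoff)
data[order(-data$Score), "Name"]   # "Alice" "Charlie" "Bob"
```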
2. Reading Data
# Read a CSV file
my_data <- read.csv("data.csv")
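If you do not have a data.csv file at hand, the round trip can be tried with a temporary file. The sketch below writes a small data frame with write.csv() and reads it back. The contents and the names original and csv_path are made up for this example.

```r
# Write a small data frame to a temporary CSV file
original <- data.frame(Name = c("Alice", "Bob"), Score = c(90, 85))
csv_path <- tempfile(fileext = ".csv")
write.csv(original, csv_path, row.names = FALSE)

# Read it back and inspect the result
my_data <- read.csv(csv_path)
str(my_data)    # Structure: column names and types
head(my_data)   # First rows
```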
Visualisation
R excels at data visualization, with built-in plotting functions and libraries like ggplot2.
1. Basic Plot
# Simple plot
x <- c(1, 2, 3, 4, 5)
y <- c(2, 4, 6, 8, 10)
plot(x, y, type = "o", col = "blue")
2. Using ggplot2
# Install ggplot2 (only needed once), then load it
install.packages("ggplot2")
library(ggplot2)

# Example plot (uses the `data` data frame created in the Data Frames section)
ggplot(data, aes(x = Age, y = Score, color = Name)) +
  geom_point() +
  labs(title = "Scores by Age")
Additional Resources
- CRAN Task Views - Curated packages for specific tasks.
- RStudio Cheatsheets - Handy guides for common R functions.
- Books like "R for Data Science" by Hadley Wickham and Garrett Grolemund.
- Data Science Machine Learning certification course by The IoT Academy.
What is R Programming Used For?
R is commonly used in many fields, especially for working with data, statistics, and machine learning. Here is a simple overview of its uses:
- Statistical Analysis: R helps analyze data using methods like regression, time series, and testing hypotheses.
- Data Visualisation: It makes charts and graphs, like bar plots, scatter plots, and heatmaps, using tools like ggplot2.
- Data Cleaning: R is great for organizing messy data using packages like dplyr and tidyr.
- Machine Learning: You can build models for predicting and clustering data with tools like caret.
- Bioinformatics: It’s used in biology to analyze genes and sequencing data with Bioconductor.
- Finance: R also helps in analyzing stock portfolios and risks, and forecasting trends in business.
- Research: Academics use R for running simulations and creating detailed reports or visuals.
- Big Data: R can work with large datasets and connects to tools like Apache Spark (for example, through the sparklyr package).
- Web Apps: With Shiny, R can create interactive apps and dashboards for sharing insights.
- Maps and Locations: R is used for working with geographic data, creating maps, and analyzing spatial information.
R is popular because it is free, has many libraries for different tasks, and can scale to large datasets through its packages.
How Can I Learn R Programming Myself?
If you’re ready to learn R, here are some steps you can take to get started:
1. Get Started with R
- R is used for data analysis, statistics, and graph making.
- Download R from CRAN and RStudio from Posit.
- Learn the basics of how RStudio works.
2. Use Beginner Resources
- Take online courses like:
- DataCamp’s R courses.
- Try interactive tools:
- Swirl (teaches R inside RStudio).
- Read books like "R for Data Science" (free at r4ds.had.co.nz).
3. Practice Writing Code
- Start with simple R programs.
- Learn about:
- Data types like vectors, lists, and data frames.
- Writing functions and using loops (e.g., for loops).
- Importing and exporting data files.
- Use example datasets like mtcars or iris to practice.
4. Learn Data Visualisation
- Use the ggplot2 package to make beautiful charts.
- Learn data manipulation using:
- dplyr for filtering and summarising.
- tidyr for organizing messy data.
5. Work on Simple Projects
- Analyse data from public sources.
- Create charts to show trends with ggplot2.
- Perform basic statistics like averages or regression.
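As a starting point for such a project, the sketch below computes an average and fits a simple linear regression on the built-in mtcars dataset. The choice of columns (mpg and wt) is only an example.

```r
# mtcars ships with base R
mean(mtcars$mpg)      # Average miles per gallon
summary(mtcars$mpg)   # Min, quartiles, mean, max

# Simple linear regression: fuel economy as a function of weight
model <- lm(mpg ~ wt, data = mtcars)
coef(model)   # Intercept and slope; the slope is negative (heavier cars, lower mpg)
```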
6. Join the R Community
- Ask R-related questions on forums like:
- RStudio Community.
- Stack Overflow (use the R tag).
- Reddit: r/Rlanguage.
- Explore projects on GitHub to learn from others.
7. Level Up
- Try advanced topics like:
- Machine learning with caret or tidymodels.
- Building interactive apps with Shiny.
- Read "Advanced R" by Hadley Wickham (free online).
8. Practice Consistently
- Set aside regular time for practice.
- Solve data challenges on Kaggle or similar platforms.
By practicing regularly and using these resources, you’ll become good at R programming step by step!
Conclusion
In conclusion, R is a simple and powerful tool for working with data. It helps you analyze data, make charts, and solve problems with numbers. With tools like ggplot2 for making graphs and Shiny for creating web apps, R makes it easy to explore and share data. Start with the basics, practice often, and use the many resources available to learn. Mastering R can open up career opportunities in data science and other fields.
Frequently Asked Questions (FAQs)
Ans. R is fairly easy to learn, especially for beginners who want to work with data. It has a simple syntax and plenty of help available, such as tutorials and community support.
Ans. Choosing between Python and R depends on what you need. Python is a general-purpose language suited to many tasks, while R is best suited to statistics and data charts.
About The Author
The IoT Academy is an ed-tech training institute that offers online and offline training in emerging technologies such as Data Science, Machine Learning, IoT, Deep Learning, and more. We aim to make online education accessible and dynamic.