R is a free programming language used to analyze data, run statistical tests, and create charts. It was created in the 1990s by Ross Ihaka and Robert Gentleman. R is popular with data scientists and researchers because, unlike general-purpose languages, it was designed specifically for working with data, which makes it well suited to analyzing and visualizing information. R has many tools that can be used in areas like machine learning, finance, and biology. This tutorial covers the basics of R, including how to set it up and how to work with variables, data types, and functions, so that you can get started with R programming.

What is R Programming Language?

R is a free programming language used for analyzing data, performing statistics, and creating graphs. It was developed in the 1990s by Ross Ihaka and Robert Gentleman and is now very popular among researchers and data scientists. R has many tools and libraries for working with data. Unlike general-purpose languages like Python or Java, R is specifically designed for tasks like organizing, analyzing, and visualizing data, which makes it a good fit for complex data problems.

History of R Programming

R was created in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. It is a free and open-source environment for statistical computing and graphics. R was inspired by the S programming language and Scheme. The goal was to make data analysis and visualization easier for researchers and statisticians. Over time, R became popular because it is flexible, can be extended with packages, and has a strong community of users. In 1997, the R Core Team was formed to manage its development. Today, R is widely used in data science to solve many types of data problems in different fields.

What is CRAN?

CRAN (Comprehensive R Archive Network) is the main repository for R itself and for R packages. It is the primary source for downloading tools for data analysis and statistics. CRAN runs automated checks on submitted packages so that they install and work with current versions of R. It also hosts guides and manuals to help users. With thousands of packages, CRAN helps users add new features to R and improve their work.
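
As a rough sketch of how a CRAN package is usually installed, the helper below installs a package only when it is missing. The name ensure_installed is hypothetical and chosen for illustration; install.packages() and requireNamespace() are standard R functions.

```r
# Hypothetical helper: install a package from CRAN only if it is missing.
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # downloads from the configured CRAN mirror
  }
  # Report (invisibly) whether the package can now be loaded
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ensure_installed("stats")  # "stats" ships with R, so nothing is downloaded
```

Calling it with a package such as "ggplot2" would trigger a download on the first run only.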

Why Learn R Programming?

The demand for skilled professionals who can manipulate and analyze data is growing rapidly. Here are some reasons to learn R:

  • Data Analysis and Statistics: R is made for analyzing data and performing statistical tasks like hypothesis testing and modeling. You can also write your own functions for specific needs.
  • Beautiful Visuals: With R, you can make clear, creative charts with ggplot2 and interactive dashboards with Shiny.
  • Used Everywhere: From finance to biology, R is used in many fields to study data and make predictions.
  • Free and Supported: R is free to use, has many add-on packages, and has a helpful community to answer questions.
  • Works Well with Others: It can connect with tools like Excel or Python and makes reports easy to share using R Markdown.
  • Job Opportunities: R is highly valued in jobs like data science, research, and analytics.
  • Easy to Learn: It is beginner-friendly, with simple syntax and many guides to help you start.

In short, R is a strong choice if you want to analyze data, create polished visuals, and explore new career opportunities.

R Programming Tutorial: The Basics

Let’s dive into the basics of R programming to give you a solid foundation. Here are the key concepts and steps you need to understand when learning R programming.

Setting Up R and RStudio

  • Install R: Download the installer for your operating system from CRAN.
  • Install RStudio: Download RStudio Desktop from Posit.

Now, here are the basics of the R programming language.

1. Variables

In R, variables are used to store data that can be accessed and manipulated throughout your code. Here's an overview of variables in R:

A. Declaring Variables

Variables are created when you assign a value to them using the assignment operator <- or =. The most common method is <-.

x <- 10          # Assign 10 to x

y = 20           # Assign 20 to y (valid but less common)

z <- x + y       # Assign the sum of x and y to z


B. Variable Names
  • Must start with a letter, or with a dot that is not followed by a digit.
  • Can contain letters, numbers, underscores (_), and dots (.).
  • Cannot contain spaces or special characters.
  • Are case-sensitive (Var and var are different).

Examples of valid names:

var1 <- 5

.my_var <- 10

this.is.valid <- TRUE

Examples of invalid names:

1var <- 5     # Invalid: starts with a number

my-var <- 10  # Invalid: contains a hyphen
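
The naming rules above can be checked programmatically. The sketch below uses the base function make.names(), which rewrites a string into a syntactically valid name; if the string comes back unchanged, it was already valid. The helper name is_valid_name is hypothetical.

```r
# Hypothetical helper: TRUE when `name` is already a syntactically valid R name
is_valid_name <- function(name) {
  make.names(name) == name
}

is_valid_name("var1")     # TRUE
is_valid_name(".my_var")  # TRUE
is_valid_name("1var")     # FALSE: starts with a number
is_valid_name("my-var")   # FALSE: contains a hyphen
```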


C. Checking Variable Type

You can use functions like class(), typeof(), or str() to check the type of a variable.

x <- 5

class(x)       # "numeric"

typeof(x)      # "double"


D. Assigning Different Data Types

R supports various data types, and a variable can hold any of them:

  • Numeric: Real numbers (e.g., 3.14, 42).
  • Integer: Whole numbers, explicitly declared with L (e.g., 10L).
  • Character: Strings (e.g., "hello").
  • Logical: Boolean values (TRUE, FALSE).
  • Complex: Complex numbers (e.g., 2+3i).
  • Raw: Raw bytes.

Examples:

a <- 42          # Numeric

b <- 10L         # Integer

c <- "Hello"     # Character

d <- TRUE        # Logical

e <- 2 + 3i      # Complex
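
To confirm how R stores each of these values, one option is to collect them in a list and ask for the type of every element. This is a small sketch; the variable names values, types, and classes are illustrative.

```r
# One value of each common atomic type
values <- list(42, 10L, "Hello", TRUE, 2 + 3i)

# typeof() reports the internal storage type of each element
types <- sapply(values, typeof)
types    # "double" "integer" "character" "logical" "complex"

# class() reports the higher-level class; a double is reported as "numeric"
classes <- sapply(values, class)
classes  # "numeric" "integer" "character" "logical" "complex"
```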


E. Viewing Variables in Memory

You can use the following functions to inspect and manage the variables in your environment:

  • ls(): Lists all variable names in the current environment.
  • rm(): Removes variables from the environment.

ls()            # List all variables

rm(a)           # Remove variable `a`

rm(list = ls()) # Clear all variables


F. Scope of Variables
  • Global variables: Declared in the main script and available throughout the program.
  • Local variables: Declared inside a function and accessible only within that function.

Example:

x <- 10          # Global variable

 

my_function <- function() {

  y <- 20        # Local variable

  return(x + y)  # Accessing global and local variables

}

 

my_function()    # Returns 30

y                # Error: object 'y' not found


2. Data Types

In R, data types define the kind of values that variables can hold. R supports several data types, each suited to specific purposes. Below is an overview:

01. Atomic Data Types

Atomic data types are the most basic types in R and include:

a. Numeric
  • Represents real numbers, with or without a decimal part (e.g., 42, 3.14).
  • Default type for numbers without explicit declaration.
  • Stored as double precision by default.

Examples:

x <- 42         # Numeric

class(x)        # "numeric"

typeof(x)       # "double"

 

b. Integer
  • Whole numbers.
  • Declared using L after the number.

Examples:

y <- 42L        # Integer

class(y)        # "integer"

typeof(y)       # "integer"


c. Character
  • Represents text strings.

Examples:

z <- "Hello, R!"

class(z)        # "character"


d. Logical
  • Represents Boolean values: TRUE or FALSE.

Examples:

is_true <- TRUE

is_false <- FALSE

class(is_true)  # "logical"


e. Complex
  • Represents complex numbers.

Examples:

cplx <- 3 + 2i  # Complex number (named cplx to avoid masking the c() function)

class(cplx)     # "complex"


f. Raw
  • Represents raw bytes.

Examples:

r <- charToRaw("R programming")

class(r)        # "raw"


02. Data Structures (Composite Data Types)
a. Vector
  • A one-dimensional collection of elements of the same type.
  • Created using the c() function.

Examples:

v <- c(1, 2, 3, 4)       # Numeric vector

char_v <- c("a", "b")    # Character vector

logical_v <- c(TRUE, FALSE) # Logical vector


b. Matrix
  • A two-dimensional collection of elements of the same type.
  • Created using the matrix() function.

Examples:

m <- matrix(1:9, nrow = 3, ncol = 3)
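
A short sketch of how the matrix above can be inspected and indexed. Note that matrix() fills values column by column unless told otherwise.

```r
m <- matrix(1:9, nrow = 3, ncol = 3)  # filled column by column

dim(m)     # 3 3 (rows, columns)
m[2, 3]    # 8: row 2, column 3
m[1, ]     # 1 4 7: the whole first row
t(m)[1, ]  # 1 2 3: first row of the transposed matrix
```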


c. List
  • A collection of elements that can be of different types.

Examples:

l <- list(1, "a", TRUE)
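
List elements can also be named, which makes them easier to retrieve. The sketch below uses an illustrative list called person and shows the difference between the $ and [[ ]] accessors, which return the element itself, and single brackets, which return a smaller list.

```r
person <- list(name = "Alice", age = 25, passed = TRUE)

person$name           # "Alice"
person[["age"]]       # 25
class(person["age"])  # "list": single brackets return a sub-list
length(person)        # 3
```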


d. Data Frame
  • A two-dimensional, table-like structure in which each column can have a different type.
  • Created using the data.frame() function.

Examples:

df <- data.frame(ID = 1:3, Name = c("Alice", "Bob", "Charlie"), Score = c(90, 85, 88))


e. Array
  • A multi-dimensional collection of elements of the same type.

Examples:

a <- array(1:12, dim = c(3, 2, 2))


03. Special Values

R includes special values that represent certain states:

  • NA: Missing or undefined value.
  • NaN: "Not a Number" (e.g., a result of 0/0).
  • Inf and -Inf: Positive and negative infinity (e.g., result of 1/0 or -1/0).

Examples:

x <- NA          # Missing value

y <- NaN         # Not a Number

z <- 1 / 0       # Inf
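
Special values need dedicated test functions, because comparing with == does not work for NA. The following sketch, with an illustrative scores vector, shows the usual checks and how na.rm = TRUE skips missing values in a calculation.

```r
scores <- c(90, NA, 85)

missing <- is.na(scores)                # FALSE TRUE FALSE
raw_mean <- mean(scores)                # NA: one missing value spoils the result
clean_mean <- mean(scores, na.rm = TRUE)  # 87.5: missing values are dropped

is.nan(0 / 0)     # TRUE
is.na(0 / 0)      # TRUE: NaN also counts as missing
is.finite(1 / 0)  # FALSE: Inf is not finite
```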


04. Factors

Factors represent categorical data and are useful for statistical modeling.

Examples:

f <- factor(c("Male", "Female", "Male", "Female"))

levels(f)       # "Female", "Male"
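
Internally, a factor stores each value as an integer code that points to one of its levels. A brief sketch of how to inspect this, reusing the factor from the example above:

```r
f <- factor(c("Male", "Female", "Male", "Female"))

counts <- table(f)      # Female: 2, Male: 2
codes <- as.integer(f)  # 2 1 2 1 (levels are sorted: Female = 1, Male = 2)
nlevels(f)              # 2
```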


05. Type Conversion

You can convert between data types using functions like as.numeric() and as.character().

Examples:

x <- "42"

num_x <- as.numeric(x)   # Converts "42" to 42
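
A few more conversions, as a sketch with illustrative variable names. When a conversion is not possible, R returns NA and raises a warning; the warning is suppressed here only to keep the output clean.

```r
int_val <- as.integer(3.9)      # 3: the decimal part is dropped, not rounded
chr_val <- as.character(42)     # "42"
num_val <- as.numeric(TRUE)     # 1
lgl_val <- as.logical("TRUE")   # TRUE

# Text that is not a number cannot be converted and becomes NA
bad_val <- suppressWarnings(as.numeric("abc"))  # NA
```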


3. Vectors

A vector is a collection of elements of the same type.

# Create a vector

vec <- c(1, 2, 3, 4, 5)

vec
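
Most operations in R work on a whole vector at once. The sketch below, using the same vec, shows element-wise arithmetic and the common ways to select elements; the result names are illustrative.

```r
vec <- c(1, 2, 3, 4, 5)

doubled <- vec * 2          # 2 4 6 8 10: applied to every element
first_two <- vec[1:2]       # 1 2: indexing starts at 1
large <- vec[vec > 3]       # 4 5: select by condition
total <- sum(vec)           # 15
```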


4. Sequences

# Generate sequences

seq(1, 10, by=2)   # Sequence from 1 to 10 with step 2

1:10               # Sequence from 1 to 10
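
Besides seq() with a step and the colon operator, base R offers a few other sequence helpers. This is a brief sketch with illustrative variable names:

```r
evenly_spaced <- seq(0, 1, length.out = 5)  # 0.00 0.25 0.50 0.75 1.00
first_four <- seq_len(4)                    # 1 2 3 4
positions <- seq_along(c("a", "b", "c"))    # 1 2 3
repeated <- rep(c("a", "b"), times = 2)     # "a" "b" "a" "b"
```

seq_len() and seq_along() are often preferred over 1:n in loops because they return an empty sequence when there is nothing to iterate over.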


Control Structures

Control structures let you control the flow of execution in your code by running specific blocks conditionally or repeatedly. Below are the main control structures in R:

a. if Statement

Executes a block of code if a condition is TRUE.

x <- 10

if (x > 5) {

  print("x is greater than 5")

}


b. if-else Statement

Adds an alternative block of code to execute when the condition is FALSE.

x <- 3

if (x > 5) {

  print("x is greater than 5")

} else {

  print("x is not greater than 5")

}


c. ifelse() Function

A vectorised form of if-else: ifelse() tests each element of a vector and returns a value for each one.

x <- c(1, 2, 3, 4, 5)

result <- ifelse(x > 3, "Greater", "Smaller or Equal")

print(result)

# Output: "Smaller or Equal" "Smaller or Equal" "Smaller or Equal" "Greater" "Greater"


d. Nested if-else

An else if chain tests several conditions in order and runs the first block whose condition is TRUE.

x <- 8

if (x > 10) {

  print("x is greater than 10")

} else if (x > 5) {

  print("x is greater than 5 but less than or equal to 10")

} else {

  print("x is 5 or less")

}


Looping Statements

a. for Loop

Iterates over a sequence or vector.

for (i in 1:5) {

  print(i)

}

Iterating over a character vector:

names <- c("Alice", "Bob", "Charlie")

for (name in names) {

  print(paste("Hello,", name))

}


b. while Loop

Repeats a block of code while a condition is TRUE.

x <- 1

while (x <= 5) {

  print(x)

  x <- x + 1

}


c. repeat Loop

repeat creates an infinite loop, which must be exited explicitly with break.

x <- 1

repeat {

  print(x)

  x <- x + 1

  if (x > 5) {

    break

  }

}


Control Flow Commands

a. break

Exits the current loop prematurely.

for (i in 1:10) {

  if (i == 5) {

    break

  }

  print(i)

}


b. next

Skips the current iteration and proceeds to the next.

for (i in 1:5) {

  if (i == 3) {

    next

  }

  print(i)

}

 

switch Statement

Evaluates an expression and selects the matching case. An unnamed final argument acts as the default.

x <- "b"

result <- switch(x,

                 "a" = "First case",

                 "b" = "Second case",

                 "c" = "Third case",

                 "Default case")

print(result)

# Output: "Second case"
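
When several values should lead to the same result, a case can be left empty so that it falls through to the next one. The sketch below wraps switch() in a hypothetical function named describe_day to show this.

```r
# Hypothetical helper: classify a day abbreviation
describe_day <- function(day) {
  switch(day,
         "Sat" = ,            # empty case: falls through to the next value
         "Sun" = "Weekend",
         "Weekday")           # unnamed last argument is the default
}

describe_day("Sat")  # "Weekend"
describe_day("Sun")  # "Weekend"
describe_day("Mon")  # "Weekday"
```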


Applying Control Structures to Functions

Control structures are often combined with functions for more complex logic. For example:

factorial <- function(n) {

  if (n == 0) {

    return(1)

  } else {

    result <- 1

    for (i in 1:n) {

      result <- result * i

    }

    return(result)

  }

}

 

factorial(5)  # Output: 120


Functions

Functions are fundamental building blocks in R that let you organise, reuse, and modularise code. Here's a detailed overview:

1. Creating Functions

Functions in R are defined using the function keyword:

function_name <- function(arg1, arg2, ...) {

  # Function body

  return(value) # Optional: Returns the last evaluated value if omitted

}

Example:

add_numbers <- function(a, b) {

  result <- a + b

  return(result)

}

 

add_numbers(3, 5)  # Output: 8


2. Function Arguments
a. Required Arguments

All arguments in the function definition must be provided when calling the function unless default values are specified.

# A function with two required arguments

multiply <- function(a, b) {

  return(a * b)

}

 

multiply(4, 5)  # Output: 20


b. Default Arguments

You can assign default values to arguments.

greet <- function(name = "User") {

  paste("Hello,", name)

}

 

greet()            # Output: "Hello, User"

greet("Alice")     # Output: "Hello, Alice"
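
Arguments can be passed by position or by name, and named arguments may appear in any order. A small sketch using a hypothetical function called power with one default value:

```r
# Hypothetical example: exponent defaults to 2
power <- function(base, exponent = 2) {
  base^exponent
}

power(3)                       # 9: uses the default exponent
power(2, 3)                    # 8: arguments matched by position
power(exponent = 3, base = 2)  # 8: arguments matched by name
```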


c. Variable-Length Arguments (...)

The ... argument lets a function accept a variable number of arguments.

sum_values <- function(...){

  return(sum(...))

}

 

sum_values(1, 2, 3, 4, 5)  # Output: 15
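
Inside a function, the contents of ... can be captured with list(...) and then inspected like any other list. The two helpers below, count_args and label_args, are hypothetical names used for illustration.

```r
# Count how many arguments were passed
count_args <- function(...) {
  args <- list(...)
  length(args)
}

# Return the names given to the arguments
label_args <- function(...) {
  names(list(...))
}

count_args(1, "a", TRUE)  # 3
label_args(x = 1, y = 2)  # "x" "y"
```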


3. Return Values

Functions can return values explicitly using return() or implicitly by evaluating the last expression.

 

# Explicit return

square <- function(x) {

  return(x^2)

}

 

# Implicit return

cube <- function(x) {

  x^3

}

 

square(4)  # Output: 16

cube(3)    # Output: 27


4. Anonymous Functions

Anonymous functions are created without a name, often used as arguments to higher-order functions.

lapply(1:5, function(x) x^2)

# Output: List of squared numbers: 1, 4, 9, 16, 25
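
Anonymous functions work with other higher-order functions too. As a sketch, sapply() returns a plain vector instead of a list, and Filter() keeps only the elements for which the function returns TRUE:

```r
squares <- sapply(1:5, function(x) x^2)
squares  # 1 4 9 16 25

evens <- Filter(function(x) x %% 2 == 0, 1:10)
evens    # 2 4 6 8 10
```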


5. Nested Functions

Functions can be defined inside other functions. An inner function is only accessible within the outer function that defines it.

outer_function <- function(x) {

  inner_function <- function(y) {

    y^2

  }

  inner_function(x) + x

}

 

outer_function(3)  # Output: 12 (3^2 + 3)


6. Scope of Variables

R uses lexical scoping to resolve variables.

  • Local Scope: Variables defined inside a function are not accessible outside.
  • Global Scope: Variables defined outside functions are accessible within functions unless a local variable of the same name masks them.

x <- 10

 

scope_example <- function() {

  x <- 5  # Local variable

  return(x)

}

 

scope_example()  # Output: 5

x                # Output: 10 (Global variable remains unchanged)
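
Lexical scoping also means that a function remembers the variables of the environment in which it was created. The sketch below illustrates this with a hypothetical function factory called make_multiplier.

```r
# Returns a new function that remembers its own `multiplier`
make_multiplier <- function(multiplier) {
  function(value) {
    value * multiplier  # `multiplier` is found in the enclosing function
  }
}

double_it <- make_multiplier(2)
triple_it <- make_multiplier(3)

double_it(5)  # 10
triple_it(5)  # 15
```

Each returned function keeps its own copy of multiplier, so the two do not interfere with each other.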


7. Applying Functions
a. apply() Family of Functions

R provides concise alternatives to explicit loops with the apply() family:

  • apply(): Apply a function to rows or columns of a matrix.
  • lapply(): Apply a function to each element of a list or vector and return a list.
  • sapply(): Simplified version of lapply() that returns a vector or matrix.
  • tapply(): Apply a function to subsets of a vector grouped by another vector.
  • mapply(): Multivariate version of sapply().

Examples:

m <- matrix(1:9, nrow = 3)

apply(m, 1, sum)  # Row sums

apply(m, 2, sum)  # Column sums

 

lst <- list(a = 1:5, b = 6:10)

lapply(lst, sum)  # Sum of each list element

sapply(lst, sum)  # Output as a vector
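
The examples above do not cover tapply() and mapply(), so here is a minimal sketch of both. The scores and groups vectors are made-up illustration data.

```r
scores <- c(90, 85, 70, 95)
groups <- c("A", "B", "A", "B")

# Mean score per group: A = (90 + 70) / 2, B = (85 + 95) / 2
group_means <- tapply(scores, groups, mean)
group_means  # A: 80, B: 90

# Apply a function to the elements of two vectors in parallel
pair_sums <- mapply(function(x, y) x + y, 1:3, 4:6)
pair_sums    # 5 7 9
```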


8. Best Practices for Writing Functions
  • Use meaningful names for functions and arguments.
  • Include comments and documentation.
  • Avoid hardcoding values; use arguments for flexibility.
  • Test functions with different input scenarios.

9. Example: Combining Functions

Here is an example that combines a function, control structures, and sapply() to grade a vector of scores:

calculate_grade <- function(score) {

  if (score >= 90) {

    return("A")

  } else if (score >= 80) {

    return("B")

  } else if (score >= 70) {

    return("C")

  } else if (score >= 60) {

    return("D")

  } else {

    return("F")

  }

}

 

scores <- c(95, 82, 67, 55)

grades <- sapply(scores, calculate_grade)

grades  # Output: "A" "B" "D" "F"
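
As an alternative sketch, the same grading can be done without an if-else chain by using the base function cut(), which assigns each value to an interval. This is not part of the original example; the breaks mirror the thresholds used in calculate_grade.

```r
scores <- c(95, 82, 67, 55)

# right = FALSE makes each interval include its lower bound: [60, 70), [70, 80), ...
grade_factor <- cut(scores,
                    breaks = c(-Inf, 60, 70, 80, 90, Inf),
                    labels = c("F", "D", "C", "B", "A"),
                    right = FALSE)

cut_grades <- as.character(grade_factor)
cut_grades  # "A" "B" "D" "F"
```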


Working with Data

1. Data Frames

Data frames are used to store tabular data.

# Create a data frame

data <- data.frame(

  Name = c("Alice", "Bob", "Charlie"),

  Age = c(25, 30, 35),

  Score = c(90, 85, 88)

)

# Access columns

data$Name

 

# View the data frame

print(data)
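
Typical next steps with a data frame are filtering rows, adding columns, and summarising. The sketch below recreates the same data frame so that it runs on its own; the names high_scorers and Passed are illustrative.

```r
data <- data.frame(
  Name = c("Alice", "Bob", "Charlie"),
  Age = c(25, 30, 35),
  Score = c(90, 85, 88)
)

# Keep only the rows where Score is above 86
high_scorers <- data[data$Score > 86, ]
high_scorers$Age      # 25 35 (Alice and Charlie)

# Add a new logical column
data$Passed <- data$Score >= 88

nrow(data)            # 3
ncol(data)            # 4
mean(data$Age)        # 30
```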


2. Reading Data

# Read a CSV file

my_data <- read.csv("data.csv")
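
Because read.csv() needs an existing file, the following sketch first writes a small made-up table to a temporary file and then reads it back. The file path and column names are assumptions for illustration only.

```r
# Write a small example table to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
example <- data.frame(ID = 1:3, Score = c(90, 85, 88))
write.csv(example, csv_path, row.names = FALSE)

# Read it back and inspect the result
loaded <- read.csv(csv_path)
names(loaded)  # "ID" "Score"
nrow(loaded)   # 3
head(loaded)   # first rows of the table

unlink(csv_path)  # remove the temporary file
```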


Visualisation

R excels at data visualisation, with built-in plotting functions and libraries like ggplot2.

1. Basic Plot

# Simple plot

x <- c(1, 2, 3, 4, 5)

y <- c(2, 4, 6, 8, 10)

plot(x, y, type="o", col="blue")


2. Using ggplot2

# Install ggplot2 package

install.packages("ggplot2")

library(ggplot2)

 

# Example plot

ggplot(data, aes(x=Age, y=Score, color=Name)) +

  geom_point() +

  labs(title="Scores by Age")


Additional Resources

  • CRAN Task Views - Curated packages for specific tasks.
  • RStudio Cheatsheets - Handy guides for common R functions.
  • Books like R for Data Science by Hadley Wickham and Garrett Grolemund.
  • Data Science Machine Learning certification course by The IoT Academy.

What is R Programming Used For?

R is commonly used in many fields, especially for working with data, statistics, and machine learning. Here is a simple explanation of its uses:

  • Statistical Analysis: R helps analyze data using methods like regression, time series, and testing hypotheses.
  • Data Visualisation: It makes charts and graphs, like bar plots, scatter plots, and heatmaps, using tools like ggplot2.
  • Data Cleaning: R is great for organizing messy data using packages like dplyr and tidyr.
  • Machine Learning: You can build models for predicting and clustering data with tools like caret.
  • Bioinformatics: It’s used in biology to analyze genes and sequencing data with Bioconductor.
  • Finance: R also helps in analyzing stock portfolios and risks, and forecasting trends in business.
  • Research: Academics use R for running simulations and creating detailed reports or visuals.
  • Big Data: R works with large datasets and connects to tools like Apache Spark.
  • Web Apps: With Shiny, R can create interactive apps and dashboards for sharing insights.
  • Maps and Locations: R is used for working with geographic data, creating maps, and analyzing spatial information.

R is popular because it is free and has libraries for a wide range of data tasks.

How Can I Learn R Programming Myself?

If you’re ready to learn R, here are some steps you can take to get started:

1. Get Started with R

  • R is used for data analysis, statistics, and graph making.
  • Download R from CRAN and RStudio from Posit.
  • Learn the basics of how RStudio works.

2. Use Beginner Resources

  • Take online courses like:
    • DataCamp’s R courses.
  • Try interactive tools:
    • Swirl (teaches R inside RStudio).
  • Read books like "R for Data Science" (free at r4ds.had.co.nz).

3. Practice Writing Code

  • Start with simple R programs.
  • Learn about:
    • Data types like vectors, lists, and data frames.
    • Writing functions and using loops (e.g., for loops).
    • Importing and exporting data files.
  • Use example datasets like mtcars or iris to practice.

4. Learn Data Visualisation and Manipulation

  • Use the ggplot2 package to make beautiful charts.
  • Learn data manipulation using:
    • dplyr for filtering and summarising.
    • tidyr for organizing messy data.

5. Work on Simple Projects

  • Analyse data from public sources.
  • Create charts to show trends with ggplot2.
  • Perform basic statistics like averages or regression.

6. Join the R Community

  • Ask R questions on forums like:
    • RStudio Community.
    • Stack Overflow (use the R tag).
    • Reddit: r/Rlanguage.
  • Explore projects on GitHub to learn from others.

7. Level Up

  • Try advanced topics like:
    • Machine learning with caret or tidymodels.
    • Building interactive apps with Shiny.
  • Read "Advanced R" by Hadley Wickham (free online).

8. Practice Consistently

  • Set aside regular time for practice.
  • Solve data challenges on Kaggle or similar platforms.

By practicing regularly and using these resources, you’ll become good at R programming step by step!

Conclusion

In conclusion, R is a simple and powerful tool for working with data. It helps you analyze data, make charts, and solve quantitative problems. With tools like ggplot2 for making graphs and Shiny for creating web apps, R makes it easy to explore and share data. R works with small datasets out of the box and can connect to tools like Apache Spark for larger ones. Start with the basics, practice often, and use the many resources available to learn. Mastering R can open up career opportunities in data science and other fields.

Frequently Asked Questions (FAQs)
Q. Is R Programming easy to learn?

Ans. R is fairly easy to learn, especially for beginners who want to work with data. It has a simple syntax and lots of help available, such as tutorials and community support.

Q. Which is better, Python or R?

Ans. Choosing between Python and R depends on what you need. Python is a general-purpose language suited to many tasks, while R is strongest at statistics and data visualisation.