• BioSakshat
  • 1 Site navigation
  • 2 Data types and Data structures
    • 2.1 Let's Begin
    • 2.2 Expression
    • 2.3 Assignment
    • 2.4 Working directory
    • 2.5 List of objects in workspace/environment
    • 2.6 Getting help in R
      • 2.6.1 Some things to Remember
    • 2.7 Data Types
    • 2.8 Data Structures
    • 2.9 Vector
      • 2.9.1 Creating a vector
      • 2.9.2 Create Vector of one element
      • 2.9.3 Create Vector using Range operator (:)
      • 2.9.4 Create Vector using c()
      • 2.9.5 seq() function
      • 2.9.6 rep() function
      • 2.9.7 sample() function
      • 2.9.8 rnorm() function
      • 2.9.9 Fetching elements from a vector
      • 2.9.10 Delete element(s) from a vector
      • 2.9.11 Add element(s) to existing vector
      • 2.9.12 Replace elements in existing vector
      • 2.9.13 Inbuilt functions for numeric vector
      • 2.9.14 Correlation using cor()
      • 2.9.15 Set operations
      • 2.9.16 Arithmetic expressions
      • 2.9.17 Arithmetic operations on vector
      • 2.9.18 Operator precedence
      • 2.9.19 Conditional statements
      • 2.9.20 Check for missing values
      • 2.9.21 Check data types and data structures
      • 2.9.22 Implicit Data type conversion
      • 2.9.23 Character vector and related functions
      • 2.9.24 Explicit Data Type Conversion
    • 2.10 Matrix
      • 2.10.1 Create a matrix from vector using dim attributes
      • 2.10.2 Create matrix using matrix()
      • 2.10.3 Functions that can be applied to a matrix
      • 2.10.4 Assign column/row names
      • 2.10.5 Fetch elements from a matrix using indices
      • 2.10.6 Fetch element using index matrix
      • 2.10.7 Fetch elements from a matrix using row and column names
      • 2.10.8 Insert rows and columns by rbind() and cbind()
      • 2.10.9 Create matrix using rbind() and cbind()
      • 2.10.10 Delete row(s) and column(s)
      • 2.10.11 Check for matrix
      • 2.10.12 Conditional statements on matrix
    • 2.11 Data Frame
      • 2.11.1 Create a data frame
      • 2.11.2 Fetch values from a data frame
      • 2.11.3 Insert column to data frame
      • 2.11.4 Delete column or row from data frame
      • 2.11.5 Check for data frame
    • 2.12 List
      • 2.12.1 Access component of list using $ notation
      • 2.12.2 Access component of list using []
  • 3 If/For/Function/IO
    • 3.1 Conditional statements
    • 3.2 Repetitive execution using for loop
      • 3.2.1 Task
    • 3.3 Functions
      • 3.3.1 Define function
      • 3.3.2 Call function
    • 3.4 File IO
      • 3.4.1 The read.table() function
      • 3.4.2 Read data
      • 3.4.3 About header=TRUE parameter
      • 3.4.4 About stringsAsFactors = FALSE parameter
      • 3.4.5 Other useful parameters:
      • 3.4.6 Read file with unequal columns in first row using read.table.
      • 3.4.7 Read file consisting of comments, blank lines, null values
      • 3.4.8 Read csv file using read.csv()
      • 3.4.9 Reading data from excel file
      • 3.4.10 Read data using readLines()
      • 3.4.11 Read data using clipboard feature
      • 3.4.12 View data frame using View()
      • 3.4.13 Edit data frame using edit()
      • 3.4.14 Write R data frames in a file using write.table()
      • 3.4.15 Write using cat()
  • 4 Data Manipulation
    • 4.1 The apply() function
    • 4.2 sapply() function
    • 4.3 lapply() function
    • 4.4 dplyr package
      • 4.4.1 Load dplyr library
      • 4.4.2 Load Gene Expression Data
      • 4.4.3 Filter rows with filter()
      • 4.4.4 Piping using %>%
      • 4.4.5 Arrange rows with arrange()
      • 4.4.6 Select rows position wise using slice()
      • 4.4.7 Select column using select()
      • 4.4.8 Extract unique rows using distinct()
      • 4.4.9 Add new columns with mutate() and transmute() function
      • 4.4.10 Summarise values with summarise()
      • 4.4.11 Grouped operations
      • 4.4.12 Join two tables
      • 4.4.13 References
  • 5 Data Visualization
    • 5.1 Base Graphics
      • 5.1.1 Simple scatter plot
      • 5.1.2 Saving plot
      • 5.1.3 Add title/subtitle/x-axis label/y-axis label to plot
      • 5.1.4 Increase line thickness
      • 5.1.5 Assign color to lines and points
      • 5.1.6 Explore built-in colors
      • 5.1.7 Explore different symbols
      • 5.1.8 Explore different line types
      • 5.1.9 Rescale the x-axis and y-axis
      • 5.1.10 Magnification of labels and symbols
      • 5.1.11 Explore bty (box type)
      • 5.1.12 Explore las (style of axis labels)
      • 5.1.13 Explore col.axis, col.lab, col.main, col.sub parameters
      • 5.1.14 Low level plotting functions (lines)
      • 5.1.15 Low level plotting functions (legend)
      • 5.1.16 Adding texts to existing plot (text)
      • 5.1.17 Explore Line Plot
      • 5.1.18 Explore simple Bar plot
      • 5.1.19 Explore stacked bar plot
      • 5.1.20 Grouped Bar plots
      • 5.1.21 Histogram
      • 5.1.22 Density plot
      • 5.1.23 Plot density plot of two distributions
      • 5.1.24 Box plots
      • 5.1.25 Pie chart
      • 5.1.26 Venn diagram
      • 5.1.27 Heatmap
      • 5.1.28 Scatter plot Matrices
      • 5.1.29 Balloon plot
      • 5.1.30 Multi panel plots
      • 5.1.31 PCA (Principal Component Analysis)
      • 5.1.32 Classical (Metric) Multidimensional Scaling
      • 5.1.33 Plotting K-means
      • 5.1.34 Dendrogram
      • 5.1.35 Network graphs
    • 5.2 Plot using ggplot2
      • 5.2.1 Scatter plot
      • 5.2.2 Add geom_point() layer
      • 5.2.3 Add geom_smooth() layer, linear modeling
      • 5.2.4 Explore aesthetic parameter “col”
      • 5.2.5 Assign aes() to individual layer
      • 5.2.6 Explore aesthetic parameter “shape”
      • 5.2.7 Add axis labels and plot title using labs()
      • 5.2.8 Change color palette
      • 5.2.9 Save the ggplot object and then print.
      • 5.2.10 The Theme
      • 5.2.11 Adjusting the legend title
      • 5.2.12 Facet_wrap
      • 5.2.13 Bar charts
      • 5.2.14 Density plot
      • 5.2.15 Box plot
      • 5.2.16 References
  • 6 Bioconductor
    • 6.1 Introduction
      • 6.1.1 Install
      • 6.1.2 Explore Bioconductor Tutorials
      • 6.1.3 Explore course and conferences materials
      • 6.1.4 Explore and Search for package using BiocViews
      • 6.1.5 Bioconductor Forum
    • 6.2 Explore Maftools
      • 6.2.1 About MAFtools
      • 6.2.2 About MAF file
      • 6.2.3 Install maftools.
      • 6.2.4 Load maftools library
      • 6.2.5 Read example maf file
      • 6.2.6 Print maf object
      • 6.2.7 Structure of maf object
      • 6.2.8 Show sample summary.
      • 6.2.9 Show frequently mutated genes.
      • 6.2.10 Show all fields in MAF
      • 6.2.11 Plotting MAF summary
      • 6.2.12 Oncoplots
      • 6.2.13 Transition and Transversions.
    • 6.3 Explore cummeRbund for Differential Gene Expression Analysis
    • 6.4 References
  • 7 R Case study and Tasks
    • 7.1 Case study 1: Gene Expression Data Analysis
      • 7.1.1 Experimental setup
      • 7.1.2 Objective
      • 7.1.3 Steps
    • 7.2 Case study 1: Solution
    • 7.3 Tasks
      • 7.3.1 Vector creation
      • 7.3.2 Fetching vector elements
      • 7.3.3 Vector manipulation
      • 7.3.4 Vector arithmetic
      • 7.3.5 Matrix
      • 7.3.6 Data Frame
      • 7.3.7 Tasks on Iris data set
      • 7.3.8 Visualization
    • 7.4 Solutions
      • 7.4.1 Vector creation
      • 7.4.2 Fetching vector elements
      • 7.4.3 Vector manipulation
      • 7.4.4 Vector arithmetic
      • 7.4.5 Matrix
      • 7.4.6 Data Frame
      • 7.4.7 Tasks on iris data set
      • 7.4.8 Visualization
  • 8 Descriptive statistics
    • 8.1 Measure of Centrality
    • 8.2 Measure of Spread
    • 8.3 Handle missing values
      • 8.3.1 Shape / Data Distribution
    • 8.4 Estimate Skewness and Kurtosis
    • 8.5 Further with Skewness and Kurtosis.
  • 9 Data Distributions
    • 9.1 Normal distribution
    • 9.2 Effect of mean and sd parameter on normal distribution
    • 9.3 Effect of n (sample size) on normal distribution
    • 9.4 Use z score to calculate percentile (area below or lower tail)
    • 9.5 Use z score to calculate Upper tail
    • 9.6 Given percentile (area below the curve or lower tail area), find the z score
    • 9.7 Explore the applet
    • 9.8 Evaluating the normal distribution
    • 9.9 Sampling distribution
    • 9.10 Standard Error
    • 9.11 Confidence interval
    • 9.12 Hypothesis test for single sample mean
    • 9.13 One tailed test (Greater)
    • 9.14 One tailed test (Lesser)
    • 9.15 Hypothesis test for two sample mean, unpaired
    • 9.16 Hypothesis test for two sample mean, paired
    • 9.17 Check for normality
    • 9.18 Check for uniform variance
    • 9.19 Hypothesis test for single sample proportion
  • 10 Regression
    • 10.1 Simple linear Regression
      • 10.1.1 Build model
      • 10.1.2 Obtain model summary (Linear Regression Diagnostics)
      • 10.1.3 Model coefficients
      • 10.1.4 Model significance and Coefficient significance
      • 10.1.5 Coefficient of Determination or R-squared
      • 10.1.6 Standard Error and F-Statistic
      • 10.1.7 AIC and BIC
      • 10.1.8 Residuals and Residual plot
      • 10.1.9 How to know if the model is best fit for your data?
      • 10.1.10 Let's predict a new dataset using the built model
    • 10.2 Multiple Linear Regression
      • 10.2.1 Build multiple regression model
  • 11 Clustering
    • 11.1 Data preparation
      • 11.1.1 Load library
      • 11.1.2 Load TCGA data
      • 11.1.3 Visualize data
      • 11.1.4 Data preparation
      • 11.1.5 Check for NA
      • 11.1.6 Scale the data
      • 11.1.7 Visualize data after scaling
    • 11.2 Clustering
    • 11.3 K-means clustering
      • 11.3.1 Visualizing k-means clusters in 2D space
      • 11.3.2 Get details of clusters
    • 11.4 Determining The Optimal Number Of Clusters
      • 11.4.1 Elbow method
      • 11.4.2 Average silhouette method
      • 11.4.3 Gap statistic method
      • 11.4.4 NbClust() function: 30 indices for choosing the best number of clusters
  • 12 PCA
    • 12.1 About Principal Component Analysis (PCA)
    • 12.2 Data preparation
      • 12.2.1 Load Gene expression data from TCGA Breast cancer
      • 12.2.2 Data standardization
    • 12.3 Perform PCA
    • 12.4 Visualization and Interpretation
    • 12.5 Eigenvalues / Variances / Number of PCs to consider
    • 12.6 Graph of variables
    • 12.7 Correlation circle
    • 12.8 Quality of representation
    • 12.9 Reference
  • 13 Network analysis
    • 13.1 Data processing
      • 13.1.1 Load iGraph Library
      • 13.1.2 Load data
    • 13.2 Convert data frame to graph
    • 13.3 Node and Edge details
    • 13.4 Plot Parameters
    • 13.5 Network layouts
    • 13.6 Network and node descriptives
      • 13.6.1 Edge density
      • 13.6.2 Diameter
    • 13.7 Centrality & centralization
      • 13.7.1 Degree centrality
      • 13.7.2 Closeness
      • 13.7.3 Betweenness
    • 13.8 Hubs and authorities
    • 13.9 Distances and paths
    • 13.10 Cliques
    • 13.11 Community detection
    • 13.12 Interactive network

BioSakshat - Free Study Materials

By Priyabrata Panigrahi and Pandurang Kolekar

2019-08-13

Chapter 1 Site navigation

This site contains the study materials for our (BioSakshat) training programs. The materials are freely available.

http://biosakshat.weebly.com/



Course Content

  • Data types and Data structures (Vector, matrix, list, data frames)
  • Learn about if/for/functions/file input-output
  • Data manipulation (learn about the dplyr package and the apply family of functions)
  • Data visualization (learn about both base graphics and ggplot2)
  • Learn about Bioconductor
  • Work through a case study and tasks
  • Learn about descriptive statistics
  • Learn about different statistical distributions
  • Learn about hypothesis testing
  • Learn about Regression
  • Clustering
  • PCA
  • Network analysis