
R fundamentals 3: data frame, matrix and array

Table of contents

  • dataframe
    • data.frame() to create a dataframe
      • data frame is like a spreadsheet with row numbers and column names
    • common operation on data frame
      • subsetting: extract element(s)
        • use index numbers to extract elements
          • single bracket to return elements of the same type
          • double bracket to return elements of its own type
        • use column names to extract elements
        • use dollar sign "$" to extract elements
        • use logical value to extract elements
      • get elements matching specific conditions
  • matrix
    • 2 ways to construct a matrix
      • use cbind() or rbind() to create a matrix
    • use matrix() to construct a matrix
    • common operations on matrices
      • subsetting: extract elements
    • summary operations on matrices in data analysis
      • get the row wise sum via rowSums() not rowsum()
      • also get the column wise sum via colSums() not colsum()
      • get the column wise mean via colMeans() not colmean()
  • array
    • use array() to create an array
    • subsetting
      • use index number to extract elements

dataframe

  • heterogeneous data structure
  • contains elements of different classes
  • 2 dimensional arrangement
    players.name=c("KD","Curry","Klay","Green")
    players.number=c(35,30,11,23)
    players.2K=c(87,96,91,85)
    players.gender=factor(c("male","male","male","male"),levels = c("male","female"))
               

data.frame() to create a dataframe

players=data.frame(players.name,players.number,players.gender,players.2K)
str(players)
           

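Notice that the character vector players.name is converted to a factor. This is the default behaviour in R versions before 4.0.0; from R 4.0.0 onward, stringsAsFactors defaults to FALSE and strings stay as characters.

In older versions we can prevent the conversion by adding the parameter stringsAsFactors=FALSE (FALSE must be written in capitals).

Be careful not to misspell the parameter as stringAsFactors.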

players=data.frame(players.name,players.number,players.gender,players.2K,stringsAsFactors=FALSE)
str(players)
players
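
To make the effect of the parameter visible regardless of the R version in use, here is a minimal sketch that sets stringsAsFactors explicitly both ways. The vectors name and score are hypothetical toy data, not the players data above.

```r
# Hypothetical toy vectors (not the article's data)
name  <- c("A", "B", "C")
score <- c(1, 2, 3)

# Explicitly request the conversion to factor (the default before R 4.0.0)
df_factor <- data.frame(name, score, stringsAsFactors = TRUE)

# Explicitly keep the strings as character vectors
df_char <- data.frame(name, score, stringsAsFactors = FALSE)

# Compare the class of every column in each data frame
sapply(df_factor, class)  # name is reported as "factor"
sapply(df_char, class)    # name is reported as "character"
```

Setting the parameter explicitly keeps the result the same across R versions.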
           

data frame is like a spreadsheet with row numbers and column names

common operation on data frame

subsetting: extract element(s)

There are 4 ways to subset a data frame.

use index numbers to extract elements

single bracket to return elements of the same type
players[2]
#single number in the brackets means column number
typeof(players[2])
           

A data frame is basically a list of columns, which is why typeof() reports "list" here.

double bracket to return elements of its own type
players[[2]]
typeof(players[[2]])
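
The difference between the two bracket forms can be checked with class(). This is a small sketch on a hypothetical data frame df (the column names id and score are made up for illustration).

```r
# Hypothetical data frame for illustration
df <- data.frame(id = 1:3, score = c(90, 85, 70))

one <- df[2]    # single bracket: still a data frame, with one column
two <- df[[2]]  # double bracket: the column itself, a numeric vector

class(one)  # "data.frame"
class(two)  # "numeric"
```

In other words, the single bracket keeps the container, while the double bracket takes the content out of it.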
           

Two numbers in the brackets are the row number and the column number.

#for example: the line below extracts the element at row 1, column 2

players[1,2]


some other examples

players[1:3,2:3]
players[c(1,3),c(2,4)]
players[1,]
players[,1]
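
One detail worth knowing about the row and column form: selecting a single column with a comma returns a plain vector, while selecting a single row returns a one-row data frame. The sketch below uses a hypothetical data frame df; the drop argument is a standard argument of the bracket operator for data frames.

```r
# Hypothetical data frame for illustration
df <- data.frame(id = 1:3, score = c(90, 85, 70))

col_vec <- df[, 2]                # one column: simplified to a vector
col_df  <- df[, 2, drop = FALSE]  # drop = FALSE keeps it as a data frame
row_df  <- df[1, ]                # one row: stays a data frame

class(col_vec)  # "numeric"
class(col_df)   # "data.frame"
class(row_df)   # "data.frame"
```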
           

use column names to extract elements

players["players.name"]
typeof(players["players.name"])
players[["players.name"]]
typeof(players[["players.name"]])
           

single bracket returns a data frame, the same type as the original

double bracket returns the column itself, in its own type

players[c("players.name","players.gender")]
           

use dollar sign “$” to extract elements

players$players.name
           

use logical value to extract elements

#a logical vector without a comma selects columns: this keeps only the first column
players[c(TRUE,FALSE,FALSE,FALSE)]
           

get elements matching specific conditions

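players[players.2K>90,]
#wrong: "=" is not a comparison operator, so the next line does not filter by gender
#players[players.gender="male"]
#right: use "==" and keep the comma so that the condition selects rows
players[players.gender=="male",]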
           

be sure to use “==” instead of “=” when testing for equality
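
As a self-contained sketch of condition-based filtering, the example below uses a hypothetical data frame df with made-up columns name and rating. It refers to the columns through df$ so that it does not depend on separate vectors existing in the workspace, and it shows subset() from base R as an alternative spelling.

```r
# Hypothetical data frame for illustration
df <- data.frame(name = c("a", "b", "c", "d"),
                 rating = c(87, 96, 91, 85),
                 stringsAsFactors = FALSE)

# Rows where rating is above 90 (note the comma: the condition selects rows)
high <- df[df$rating > 90, ]

# Conditions can be combined with & (and) or | (or)
both <- df[df$rating > 85 & df$name != "b", ]

# subset() expresses the same row filter without repeating df$
same <- subset(df, rating > 90)

high$name  # "b" "c"
both$name  # "a" "c"
```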

vector, list and data frame are the most common data structures in data analysis (DA)

matrix

  • homogeneous data structure to store elements of the same type
  • 2D data arrangement
  • usually the matrix is used to store numerical elements for computing

    student.English.score=c(100L,98L,99L,96L)

    student.math.score=c(10L,11L,9L,15L)

2 ways to construct a matrix

use cbind() or rbind() to create a matrix

  • rbind() means row bind
  • row names are taken from the vector names; the columns are unnamed and shown by position ([,1], [,2], ...)

    student.score1=rbind(student.math.score,student.English.score)

    student.score1

    str(student.score1)


column names can be given afterwards by colnames()

colnames(student.score1)=c("KD","Curry","Klay","Green")

student.score1
           
  • cbind() means column bind
  • column names are taken from the vector names

code:

student.score2=cbind(student.math.score,student.English.score)
student.score2
str(student.score2)
rownames(student.score2)=c("KD","Curry","Klay","Green")
student.score2
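
The two binding functions produce transposed layouts of the same data. The sketch below checks this with dim() and t() on two short vectors; the names math and english are illustrative stand-ins for the score vectors above.

```r
# Illustrative vectors (same values as the score vectors above)
math    <- c(10L, 11L, 9L, 15L)
english <- c(100L, 98L, 99L, 96L)

by_row <- rbind(math, english)  # each vector becomes a row
by_col <- cbind(math, english)  # each vector becomes a column

dim(by_row)  # 2 4
dim(by_col)  # 4 2

# Transposing the row-bound matrix gives the same values as the column-bound one
all(t(by_row) == by_col)  # TRUE
```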
           

use matrix() to construct a matrix

student.score3=matrix(c(10L,11L,9L,15L,100L,98L,99L,96L),ncol=2,nrow=4)
student.score3
           
  • generate the matrix by listing the elements and giving the number of rows and columns
  • by default, the matrix is filled column by column
  • we can fill the matrix row by row by adding "byrow=TRUE"

    code:

    student.score4=matrix(c(10L,11L,9L,15L,100L,98L,99L,96L),ncol=4,nrow=2,byrow=TRUE)

    student.score4
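
The fill order is easiest to see with consecutive numbers. This is a minimal sketch using the values 1 to 6 rather than the score data.

```r
v <- 1:6

# Default: filled column by column
m_col <- matrix(v, nrow = 2, ncol = 3)

# byrow = TRUE: filled row by row
m_row <- matrix(v, nrow = 2, ncol = 3, byrow = TRUE)

m_col[1, ]  # 1 3 5
m_row[1, ]  # 1 2 3
```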


common operations on matrices

student.score2
           

subsetting:extract elements

  • subsetting a matrix is similar to subsetting a data frame
  • matrices use index numbers or logical values to extract elements

code:

student.score2[4,2]
student.score2[1,]
student.score2[c(T,T,F,F),]
student.score2[student.math.score>10,]
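
Once a matrix has row and column names, those names can also be used as indices, in addition to the index numbers and logical values listed above. The sketch below builds a small hypothetical matrix m with made-up names to show this.

```r
# Hypothetical 3 x 2 matrix with row names a, b, c and column names x, y
m <- matrix(1:6, nrow = 3,
            dimnames = list(c("a", "b", "c"), c("x", "y")))

m["b", "y"]         # element at row "b", column "y": 5
m[m[, "x"] > 1, ]   # rows whose x value is greater than 1 (rows b and c)
```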
           

summary operations on matrices in data analysis

student.score2
           

get the row wise sum via rowSums() not rowsum()

rowSums(student.score2)
           

also get the column wise sum via colSums() not colsum()

colSums(student.score2)
           

get the column wise mean via colMeans() not colmean()

colMeans(student.score2)
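
For reference, here is a small sketch of the three summary functions on a 2 x 3 matrix of consecutive numbers, together with the equivalent apply() call for the row sums. The matrix m is illustrative only.

```r
# Illustrative matrix: row 1 is 1 3 5, row 2 is 2 4 6
m <- matrix(1:6, nrow = 2)

rowSums(m)   # 9 12
colSums(m)   # 3 7 11
colMeans(m)  # 1.5 3.5 5.5

# apply() with margin 1 (rows) gives the same row sums
apply(m, 1, sum)
```

The dedicated functions are the usual choice for sums and means; apply() is the general form that accepts any function.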
           

array

  • homogeneous data structure
  • n-dimensional
  • an array can be regarded as multiple sheets stacked on top of each other
  • a matrix is like a single sheet, as discussed above

#create first matrix
class1.student.math=c(100,99,87,96)
class1.student.English=c(90,98,90,87)
class1.student.marks=cbind(class1.student.math,class1.student.English)
class1.student.marks
#create the second matrix
class2.student.math=c(91,93,91,92)
class2.student.English=c(98,92,80,78)
class2.student.marks=cbind(class2.student.math,class2.student.English)
class2.student.marks
           

use array() to create an array

student.marks=array(c(class1.student.marks,class2.student.marks),dim=c(4,2,2))
student.marks
           

subsetting

use index number to extract elements

student.marks[1,,1]
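
The three indices of a 3-dimensional array are row, column and sheet, and leaving a position empty selects everything along that dimension. The sketch below uses a hypothetical array a filled with the numbers 1 to 24 so that the positions are easy to follow.

```r
# Hypothetical array: 4 rows, 3 columns, 2 sheets, filled with 1..24
a <- array(1:24, dim = c(4, 3, 2))

a[1, , 1]    # row 1 of sheet 1: 1 5 9
a[2, 3, 2]   # row 2, column 3 of sheet 2: 22
dim(a[, , 2])  # the whole second sheet is a 4 x 3 matrix
```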
           