今天参加了一个SQL的Workshop,本来早上九点实在是对我有点早想着不来的,不过发现并没有用过AWS,就还是来看一看。这里是这次Workshop的Github连接:Learning Analytics Curriculum & Teaching
详情还是参见Github,这里仅为简单罗列重点内容,有问题可以私信!
用R连接的好处就是可以直接在R里面操作数据库并进行数据分析,不需要到处.csv的中间文件,然后再加载进R的工作空间,也算是效率上的小提升吧!
拥有一个Amazon Web Services(AWS)账号
注册地址:https://aws.amazon.com/
*注:需要信用卡,但是以下操作皆免费,也许需要科学上网。
在AWS上创建一个MySQL的实例
- 登录你的AWS账号
- 找到
→ \rightarrow →Services
→ \rightarrow →Database
RDS
- 点击
Create Database
-
下选择Choose a database creation method
Standard Create
-
里选择Engine options
MySQL
-
选择Templates
Free tier
-
中设置Settings
,如DB instance identifier
sqltest
-
下的Settings
中设置Credential settings
和username
password
- 将Under
中的Connectivity
里的Additional connectivity configuration
勾选为Publicly accessible
Yes
- 在
里设置Additional configuration
,如database name
testdb
- 不要勾选
Automatic backups
- 点击
Create database
设置Security Group
- 在
里点击Security Groups
然后点击Inbound
Edit
- 点击
选择Add Rule
修改SQL/Aurora
为Source
My IP
用R连接AWS上的数据库
- DBI: A database interface definition for communication between R and relational database management systems. All classes in this package are virtual and need to be extended by the various R/DBMS implementations.
- RMySQL: Legacy ‘DBI’ interface to ‘MySQL’ / ‘MariaDB’ based on old code ported from S-PLUS. A modern ‘MySQL’ client based on ‘Rcpp’ is available from the ‘RMariaDB’ package.
library(DBI)
library(RMySQL)
db_user <- 'admin'
db_password <- 'your_password'
db_name <- 'testdb'
# 这个是RDS=>Databases=>Connectivity&security里面的Endpoint
db_host <- 'sqltest.c24zvetb1ohx.us-east-2.rds.amazonaws.com'
db_port <- 3306
mydb <- dbConnect(MySQL(), user = db_user, password = db_password, dbname = db_name, host = db_host, port = db_port)
summary(mydb)