GeoTrellis: Tiling Raster Data

  • Preface
  • Steps
    • 1. Create a Maven project
    • 2. Add the required libraries to the POM
    • 3. Write the code
  • Summary

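Preface

Versions: GeoTrellis 2.3.3, Scala 2.11.12, Java 1.8, Hadoop 3.0.0

Goal: tile raster data using the pipeline module in GeoTrellis.

Steps

1. Create a Maven project

An ordinary Maven Java project is all that is needed.

2. Add the required libraries to the POM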

    <properties>
        <geotrellis.version>2.3.3</geotrellis.version>
        <scala.version>2.11</scala.version>
        <spark.version>2.4.0</spark.version>
    </properties>

    <dependencies>
        <!-- https://mvnrepository.com/artifact/org.locationtech.geotrellis/geotrellis-raster -->
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-raster_${scala.version}</artifactId>
            <version>${geotrellis.version}</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.locationtech.geotrellis/geotrellis-spark -->
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-spark_${scala.version}</artifactId>
            <version>${geotrellis.version}</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.locationtech.geotrellis/geotrellis-proj4 -->
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-proj4_${scala.version}</artifactId>
            <version>${geotrellis.version}</version>
        </dependency>
        <!-- https://mvnrepository.com/artifact/org.locationtech.geotrellis/geotrellis-spark-pipeline -->
        <dependency>
            <groupId>org.locationtech.geotrellis</groupId>
            <artifactId>geotrellis-spark-pipeline_${scala.version}</artifactId>
            <version>${geotrellis.version}</version>
        </dependency>
        <dependency>
            <groupId>org.apache.spark</groupId>
            <artifactId>spark-core_${scala.version}</artifactId>
            <version>${spark.version}</version>
        </dependency>
    </dependencies>
    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-shade-plugin</artifactId>
                <executions>
                    <execution>
                        <phase>package</phase>
                        <goals>
                            <goal>shade</goal>
                        </goals>
                        <configuration>
                            <filters>
                                <filter>
                                    <artifact>*:*</artifact>
                                    <excludes>
                                        <exclude>META-INF/*.SF</exclude>
                                        <exclude>META-INF/*.DSA</exclude>
                                        <exclude>META-INF/*.RSA</exclude>
                                    </excludes>
                                </filter>
                            </filters>
                            <transformers>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                    <!-- Set this to the fully qualified name of your own main class -->
                                    <mainClass>core.OffLineRecommendHandle</mainClass>
                                </transformer>
                                <transformer
                                        implementation="org.apache.maven.plugins.shade.resource.AppendingTransformer">
                                    <resource>reference.conf</resource>
                                </transformer>
                            </transformers>
                        </configuration>
                    </execution>
                </executions>
            </plugin>
            <plugin>
                <groupId>org.scala-tools</groupId>
                <artifactId>maven-scala-plugin</artifactId>
                <version>2.15.2</version>
                <executions>
                    <execution>
                        <goals>
                            <goal>compile</goal>
                            <goal>testCompile</goal>
                        </goals>
                    </execution>
                </executions>
            </plugin>
        </plugins>
    </build>

           

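The POM above also adds two plugins. The first, maven-shade-plugin, is used for packaging. The second, maven-scala-plugin, adds support for compiling Scala in a Java project.

3. Write the code

Because I need to run this as a batch operation, the input and output are passed in as parameters. The pipeline executes step by step according to the JSON description, and the result of each step is passed to the next.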

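// Imports as used in the GeoTrellis 2.x pipeline guide
import geotrellis.spark._
import geotrellis.spark.pipeline._
import geotrellis.spark.pipeline.ast._
import org.apache.spark.SparkContext

// Tiles the raster at `input` and writes the resulting pyramid to `output` as layer `layerName`.
def insertTileToHdfs(implicit sc: SparkContext, input: String, output: String, layerName: String): Unit = {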
    val maskJson: String =
      """
        |[
        |  {
        |    "uri" : "{input}",
        |    "type" : "singleband.spatial.read.hadoop"
        |  },
        |  {
        |    "resample_method" : "nearest-neighbor",
        |    "type" : "singleband.spatial.transform.tile-to-layout"
        |  },
        |  {
        |    "crs" : "EPSG:3857",
        |    "scheme" : {
        |      "crs" : "epsg:3857",
        |      "tileSize" : 256,
        |      "resolutionThreshold" : 0.1
        |    },
        |    "resample_method" : "nearest-neighbor",
        |    "type" : "singleband.spatial.transform.buffered-reproject"
        |  },
        |  {
        |    "end_zoom" : 0,
        |    "resample_method" : "nearest-neighbor",
        |    "type" : "singleband.spatial.transform.pyramid"
        |  },
        |  {
        |    "name" : "{layerName}",
        |    "uri" : "{output}",
        |    "key_index_method" : {
        |      "type" : "zorder"
        |    },
        |    "scheme" : {
        |      "crs" : "epsg:3857",
        |      "tileSize" : 256,
        |      "resolutionThreshold" : 0.1
        |    },
        |    "type" : "singleband.spatial.write"
        |  }
        |]
      """.stripMargin

    // Fill in the {input}, {output} and {layerName} placeholders of the template
    val maskJsonStr = maskJson
      .replace("{input}", input)
      .replace("{output}", output)
      .replace("{layerName}", layerName)

    // parse the JSON above
    val list: Option[Node[Stream[(Int, TileLayerRDD[SpatialKey])]]] = maskJsonStr.node

    list match {
      case None => println("Couldn't parse the JSON")
      case Some(node) => {
        // eval evaluates the pipeline
        // the result type of evaluation in this case is Stream[(Int, TileLayerRDD[SpatialKey])]
        node.eval.foreach { case (zoom, rdd) =>
          println(s"ZOOM: ${zoom}")
          println(s"COUNT: ${rdd.count}")
        }
      }
    }

  }
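
One thing to watch with the string replacement above: the values are pasted into the JSON as they are, so a path that contains a backslash or a double quote (a Windows-style d:\data path, for example) would produce invalid JSON. Below is a minimal sketch of a safer substitution in plain Scala. PipelineTemplate, jsonEscape and fill are hypothetical names that are not part of GeoTrellis, and the sketch only escapes quotes and backslashes, not control characters.

```scala
object PipelineTemplate {
  // Escape the two characters that most often break a JSON string literal.
  // (Hypothetical helper; control characters are not handled here.)
  def jsonEscape(s: String): String =
    s.flatMap {
      case '"'  => "\\\""
      case '\\' => "\\\\"
      case c    => c.toString
    }

  // Replace every {key} placeholder in the template with its escaped value.
  def fill(template: String, values: Map[String, String]): String =
    values.foldLeft(template) { case (acc, (key, value)) =>
      acc.replace(s"{$key}", jsonEscape(value))
    }
}
```

With this in place, the three chained replace calls could be written as PipelineTemplate.fill(maskJson, Map("input" -> input, "output" -> output, "layerName" -> layerName)).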

           

Calling it from the main method:

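import geotrellis.spark.util.SparkUtils
import org.apache.spark.{SparkConf, SparkContext}

def main(args: Array[String]): Unit = {
    val conf =
      new SparkConf()
        // .setMaster("local")  // Set this to run locally. Leave it out on a cluster: build the jar and pass --master yarn to spark-submit instead.
        .setAppName("Spark Tiler")
        .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
        .set("spark.kryo.registrator", "geotrellis.spark.io.kryo.KryoRegistrator")
    implicit val sc: SparkContext = SparkUtils.createSparkContext("IngestTestRasterSourceToHadoop", conf)

    // The TIFF source is on the local disk here, written as a file URI such as "file:///d:/data/"
    val sour = "file:/d:/data/atmors/sour/FJ-PM25/PM25-layer1.tif"
    // Where the tiles are saved; they can be saved to HDFS as well
    val layers = "file:/d:/data/atmors/tiles/pm25/"
    // The original post does not show the layer name, so "pm25" is only an example value
    val layerName = "pm25"
    insertTileToHdfs(sc, sour, layers, layerName)
  }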
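
Since the point of passing the input and output as parameters is batch processing, here is a minimal sketch of how a list of TIFF files could be turned into one tiling job each. Everything in it is an assumption for illustration: BatchJobs, TileJob, layerNameOf and jobsFor are hypothetical names, and deriving the layer name from the lower-cased file name is just one possible convention.

```scala
object BatchJobs {
  // One tiling job: the three string arguments that insertTileToHdfs takes.
  final case class TileJob(input: String, output: String, layerName: String)

  // Derive a layer name from a file URI: drop the directory part and the
  // extension, then lower-case what is left. (Illustrative convention only.)
  def layerNameOf(uri: String): String = {
    val fileName = uri.substring(uri.lastIndexOf('/') + 1)
    val dot = fileName.lastIndexOf('.')
    val base = if (dot > 0) fileName.substring(0, dot) else fileName
    base.toLowerCase
  }

  // Build one job per input file; all layers go to the same output location
  // and are told apart by their layer names.
  def jobsFor(inputs: Seq[String], outputRoot: String): Seq[TileJob] =
    inputs.map(in => TileJob(in, outputRoot, layerNameOf(in)))
}
```

Inside main, the jobs could then be run one after another with BatchJobs.jobsFor(inputs, layers).foreach(j => insertTileToHdfs(sc, j.input, j.output, j.layerName)), where inputs is your own list of TIFF URIs.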
           

Result:

[Image: screenshot of the tiling result]
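
The ZOOM values printed by node.eval are the levels of the tile pyramid. Assuming the "scheme" block in the JSON (EPSG:3857 with tileSize 256) resolves to the usual Web Mercator zoomed layout, zoom z has 2^z by 2^z tiles and every additional zoom level halves the cell size. The sketch below works those numbers out in plain Scala. WebMercatorZoom and its members are hypothetical names used only for illustration; they are not GeoTrellis API.

```scala
object WebMercatorZoom {
  // Width of the Web Mercator (EPSG:3857) world extent in metres: 2 * pi * 6378137.
  val WorldWidth: Double = 2 * math.Pi * 6378137.0

  // Number of tiles along each axis at the given zoom level.
  def tilesPerAxis(zoom: Int): Long = 1L << zoom

  // Ground size of one pixel (in metres, at the equator) at the given zoom level.
  def cellSize(zoom: Int, tileSize: Int = 256): Double =
    WorldWidth / (tileSize.toDouble * tilesPerAxis(zoom))
}
```

For example, WebMercatorZoom.cellSize(0) is roughly 156543.03 metres per pixel, which is the familiar zoom 0 resolution for 256-pixel tiles. Comparing these values with the resolution of the source TIFF gives a rough idea of which zoom level the pyramid starts from.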

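Summary

1. Tiling and ingesting this way takes far less code than the approach I used before, and it is much nicer to work with.

2. For the pipeline module, see the official documentation: https://geotrellis.readthedocs.io/en/v3.5.1/guide/pipeline.html (note that this link is for v3.5.1, while this post uses 2.3.3).

3. Read the documentation, learn as much as you can, keep experimenting, and don't give up. Along the way you may run into problems with POM dependencies, imports, and so on. Be patient. There is always a way through, and it usually comes down to reading a bit more documentation, so don't rush.

4. You are welcome to get in touch to learn from each other and discuss. My WeChat: huangchuanxiaa.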
