hotspot_cluster: Spatiotemporal clustering of hot spot data

hotspot_cluster R Documentation

Description

This is the main function of the package. It clusters hot spots into fires and can be used to reconstruct fire history and detect fire ignition points.

Usage

hotspot_cluster(
  hotspots,
  lon = "lon",
  lat = "lat",
  obsTime = "obsTime",
  activeTime = 24,
  adjDist = 3000,
  minPts = 4,
  minTime = 3,
  ignitionCenter = "mean",
  timeUnit = "n",
  timeStep = 1
)

Arguments

hotspots

List/Data frame. A list or a data frame that contains information about hot spots.

lon

Character. The name of the column in hotspots that contains numeric longitude values.

lat

Character. The name of the column in hotspots that contains numeric latitude values.

obsTime

Character. The name of the column in hotspots that contains the observed time of hot spots. The observed time must be a date, a datetime or a numeric value.

activeTime

Numeric (>=0). Time tolerance. Unit is time index.

adjDist

Numeric (>0). Distance tolerance. Unit is metre.

minPts

Numeric (>0). Minimum number of hot spots in a cluster.

minTime

Numeric (>=0). Minimum length of time of a cluster. Unit is time index.

ignitionCenter

Character. Method to calculate ignition points, either "mean" or "median".

timeUnit

Character. One of "s" (seconds), "m" (minutes), "h" (hours), "d" (days) and "n" (numeric).

timeStep

Numeric (>0). Number of units of timeUnit in a time step.

Details

Arguments timeUnit and timeStep need to be specified so that date, datetime or numeric observed times can be converted to a time index. More details can be found in transform_time_id().
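As a rough illustration of this conversion, the sketch below maps observed times to a time index in base R. The helper name to_time_index() is hypothetical and not part of the package, and the rule it applies (index 1 at the earliest observation, advancing by one every timeStep units of timeUnit) is an assumption made for illustration. transform_time_id() remains the authoritative reference.

```r
# Hypothetical helper, not part of the package: convert observed times
# to a 1-based time index. Assumes the earliest observation gets index 1.
to_time_index <- function(obsTime, timeUnit = "h", timeStep = 1) {
  if (timeUnit == "n") {
    # Numeric input: elapsed time is a plain difference
    elapsed <- obsTime - min(obsTime)
  } else {
    # Date/datetime input: elapsed time in the requested unit
    units <- c(s = "secs", m = "mins", h = "hours", d = "days")[[timeUnit]]
    elapsed <- as.numeric(difftime(obsTime, min(obsTime), units = units))
  }
  floor(elapsed / timeStep) + 1
}

obs <- as.POSIXct(c("2020-01-01 00:10:00",
                    "2020-01-01 01:40:00",
                    "2020-01-01 05:00:00"), tz = "UTC")

# Hourly time steps
time_id <- to_time_index(obs, timeUnit = "h", timeStep = 1)

# Numeric observed times with a step of 2 units
time_id_num <- to_time_index(c(10, 12, 15), timeUnit = "n", timeStep = 2)
```

With a larger timeStep, more observations fall into the same time index, which coarsens the temporal resolution of the clustering.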

This clustering algorithm consists of 5 steps:

In step 1, it defines T intervals using the time index:

Interval(t) = [max(1, t - activeTime), t]

where t = 1, 2, ..., T, and T is the maximum time index. activeTime is an argument that needs to be specified. It represents the maximum time difference between two hot spots in the same local cluster. Note that a local cluster is different from a cluster in the final result. Local clusters are described in step 2.
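The interval definition in step 1 can be sketched in a few lines of base R. The helper name make_intervals() and the toy time indexes are hypothetical and used only for illustration.

```r
# Hypothetical helper: build Interval(t) = [max(1, t - activeTime), t]
# for t = 1, ..., maxTimeID.
make_intervals <- function(maxTimeID, activeTime) {
  t <- seq_len(maxTimeID)
  data.frame(t = t, start = pmax(1, t - activeTime), end = t)
}

intervals <- make_intervals(maxTimeID = 5, activeTime = 2)

# Toy time indexes of five hot spots
time_id <- c(1, 2, 2, 4, 5)

# Positions of the hot spots that fall inside interval t = 4, i.e. [2, 4]
in_interval_4 <- which(time_id >= intervals$start[4] &
                         time_id <= intervals$end[4])
```

Because consecutive intervals overlap, the same hot spot usually appears in several intervals.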

In step 2, the algorithm performs spatial clustering on each interval. A local cluster is a cluster found in an interval. Argument adjDist is used to control the spatial clustering. If the distance between two hot spots is smaller than or equal to adjDist, they are directly connected. If hot spot A is directly connected with hot spot B, and hot spot B is directly connected with hot spot C, then hot spots A, B and C are connected. All connected hot spots form a local cluster.
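The connectivity rule in step 2 amounts to finding connected components of a graph whose edges join hot spots at most adjDist apart. The sketch below is one way to do this in base R. The helpers haversine_m() and local_clusters() are hypothetical, and the use of the haversine formula with an Earth radius of 6371 km is an assumption of this sketch, not necessarily how the package measures distance.

```r
# Hypothetical helper: great-circle (haversine) distance in metres.
# The Earth radius of 6371 km is an assumption of this sketch.
haversine_m <- function(lon1, lat1, lon2, lat2, radius = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Hypothetical helper: label connected components, where two hot spots
# are directly connected if their distance is <= adjDist.
local_clusters <- function(lon, lat, adjDist) {
  n <- length(lon)
  dist <- outer(seq_len(n), seq_len(n), function(i, j) {
    haversine_m(lon[i], lat[i], lon[j], lat[j])
  })
  adjacent <- dist <= adjDist
  label <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(label[start])) next
    # Start a new local cluster and grow it breadth-first
    current <- current + 1L
    label[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(adjacent[node, ] & is.na(label))
      label[neighbours] <- current
      queue <- c(queue, neighbours)
    }
  }
  label
}

# A, B and C form a chain (A-B and B-C are within 3000 m, A-C is not);
# D is far away from all of them.
lon <- c(145.00, 145.03, 145.06, 146.00)
lat <- c(-37.00, -37.00, -37.00, -37.00)
labels <- local_clusters(lon, lat, adjDist = 3000)
```

Hot spots A and C end up in the same local cluster even though they are more than adjDist apart, because both are directly connected to B.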

In step 3, the algorithm starts from interval 1. It marks all hot spots in this interval and records their membership labels. Then it moves on to interval 2. Because a hot spot can exist in multiple intervals, it checks whether any hot spot in interval 2 has already been marked. If so, the membership labels of those hot spots are carried over from the record. Unmarked hot spots in interval 2 that share a local cluster with marked hot spots take their membership labels from those marked hot spots. If an unmarked hot spot shares a local cluster with multiple marked hot spots, the algorithm carries over the membership label from the nearest one. All other unmarked hot spots in interval 2, which do not share a local cluster with any marked hot spot, have their membership labels adjusted so that the clusters they belong to are treated as new clusters. Finally, all hot spots in interval 2 are marked and their membership labels are recorded. This process continues for intervals 3, 4, ..., T. After step 3, all hot spots are marked and their membership labels are recorded.
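The label carry-over in step 3 can be sketched for a single interval as follows. This is one reading of the description above, not the package's implementation. The helper carry_over() and its arguments are hypothetical, and the toy example uses positions on a line instead of longitude and latitude to keep the distances easy to check.

```r
# Hypothetical helper: process one interval of step 3.
#   membership : labels recorded so far, NA for unmarked hot spots
#   idx        : indices of the hot spots in the current interval
#   local      : local cluster label of each hot spot in `idx`
#   dist       : matrix of pairwise distances between all hot spots
carry_over <- function(membership, idx, local, dist) {
  updated <- membership
  next_label <- max(c(0, membership), na.rm = TRUE) + 1
  for (cl in unique(local)) {
    members <- idx[local == cl]
    marked <- members[!is.na(membership[members])]
    unmarked <- members[is.na(membership[members])]
    if (length(unmarked) == 0) next
    if (length(marked) == 0) {
      # No marked hot spot in this local cluster: treat it as a new cluster
      updated[unmarked] <- next_label
      next_label <- next_label + 1
    } else {
      # Copy the label of the nearest marked hot spot
      for (i in unmarked) {
        nearest <- marked[which.min(dist[i, marked])]
        updated[i] <- membership[nearest]
      }
    }
  }
  updated
}

# Toy positions on a line; hot spots 1 and 2 are already marked
x <- c(0, 10, 4, 7, 100)
dist <- abs(outer(x, x, "-"))
membership <- c(1, 2, NA, NA, NA)

# Hot spots 1 to 4 share local cluster 1, hot spot 5 is alone in local cluster 2
updated <- carry_over(membership, idx = 1:5, local = c(1, 1, 1, 1, 2),
                      dist = dist)
```

In this example hot spot 3 is nearest to marked hot spot 1 and hot spot 4 is nearest to marked hot spot 2, so they receive different membership labels despite sharing a local cluster. Hot spot 5 starts a new cluster.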

In step 4, it checks each cluster. If a cluster contains fewer than minPts hot spots, or lasts for less than minTime, it is no longer considered a cluster, and its hot spots are assigned the membership label -1. A hot spot with membership label -1 is noise. Arguments minPts and minTime need to be specified.
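A minimal sketch of the noise filter in step 4 is given below. The helper filter_noise() is hypothetical, and it assumes that the length of time of a cluster is the difference between its latest and earliest time index.

```r
# Hypothetical helper: relabel clusters that are too small or too short
# as noise (-1). Assumes a cluster's length of time is
# max(timeID) - min(timeID).
filter_noise <- function(membership, timeID, minPts, minTime) {
  n_pts <- tapply(timeID, membership, length)
  time_len <- tapply(timeID, membership, function(t) max(t) - min(t))
  bad <- names(n_pts)[n_pts < minPts | time_len < minTime]
  membership[as.character(membership) %in% bad] <- -1
  membership
}

membership <- c(1, 1, 1, 1, 2, 2, 3, 3, 3, 3)
timeID     <- c(1, 2, 3, 5, 1, 6, 2, 2, 3, 3)

# Cluster 1 is kept, cluster 2 has too few hot spots,
# and cluster 3 does not last long enough.
filtered <- filter_noise(membership, timeID, minPts = 4, minTime = 3)
```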

In step 5, the algorithm finds the earliest observed hot spots in each cluster and records them as ignition points. If there are multiple earliest observed hot spots in a cluster, the mean or median of their longitude values and latitude values is used as the coordinates of the ignition point. The choice between mean and median is specified in argument ignitionCenter.
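Step 5 can be sketched as below. The helper find_ignition() and its output columns are hypothetical and chosen for illustration. The sketch skips noise points, that is, hot spots with membership label -1.

```r
# Hypothetical helper: one ignition point per cluster, taken as the centre
# of the earliest observed hot spots (noise, labelled -1, is skipped).
find_ignition <- function(lon, lat, timeID, membership,
                          ignitionCenter = "mean") {
  center <- switch(ignitionCenter,
                   mean = mean,
                   median = median,
                   stop("ignitionCenter must be \"mean\" or \"median\""))
  clusters <- sort(unique(membership[membership != -1]))
  rows <- lapply(clusters, function(cl) {
    in_cluster <- membership == cl
    first_time <- min(timeID[in_cluster])
    earliest <- in_cluster & timeID == first_time
    data.frame(membership = cl,
               lon = center(lon[earliest]),
               lat = center(lat[earliest]),
               timeID = first_time)
  })
  do.call(rbind, rows)
}

lon <- c(145.0, 145.2, 145.4, 146.0, 146.1)
lat <- c(-37.0, -37.2, -37.1, -36.0, -36.1)
timeID <- c(1, 1, 2, 3, 4)
membership <- c(1, 1, 1, 2, 2)

# Cluster 1 has two earliest hot spots, so their coordinates are averaged
ignition <- find_ignition(lon, lat, timeID, membership,
                          ignitionCenter = "mean")
```

The median is less sensitive than the mean to a single outlying hot spot among the earliest observations, which is one reason to prefer ignitionCenter = "median" when the earliest detections are spread out.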

Value

A spotoroo object that holds the clustering results. It is also a list with the following elements:

  • hotspots : A data frame containing information about hot spots.

    • lon : Longitude.

    • lat : Latitude.

    • obsTime : Observed time.

    • timeID : Time index.

    • membership : Membership label.

    • noise : Whether it is a noise point.

    • distToIgnition : Distance to the ignition location.

    • distToIgnitionUnit : Unit of distance to the ignition location.

    • timeFromIgnition : Time from ignition.

    • timeFromIgnitionUnit : Unit of time from ignition.

  • ignition : A data frame containing information about ignition points.

    • lon : Longitude.

    • lat : Latitude.

    • obsTime : Observed time.

    • timeID : Time index.

    • obsInCluster : Number of observations in the cluster.

    • clusterTimeLen : Length of time of the cluster.

    • clusterTimeLenUnit : Unit of length of time of the cluster.

  • setting : A list containing the clustering settings.

Examples



  # Time consuming functions (>5 seconds)


  # Load the package; `hotspots` is its built-in example dataset
  library(spotoroo)

  # Get clustering results
  result <- hotspot_cluster(hotspots,
                lon = "lon",
                lat = "lat",
                obsTime = "obsTime",
                activeTime = 24,
                adjDist = 3000,
                minPts = 4,
                minTime = 3,
                ignitionCenter = "mean",
                timeUnit = "h",
                timeStep = 1)

  # Make a summary of the clustering results
  summary(result)

  # Make a plot of the clustering results
  plot(result, bg = plot_vic_map())



TengMCing/spotoroo documentation built on Nov. 21, 2024, 4:17 a.m.