Florida’s Coral Reef Water Quality Data Compilation (FCRWQDC)

This report covers data ingestion and initial analysis of data from the FL WIN water quality database and other sources. The statistics below summarize the data across all analytes and programs. For more information on specific analytes and providers, see the analyte reports and provider reports.

Data for each analyte can be downloaded from the relevant analyte report. The full compiled dataset can be downloaded from the University of South Florida here.

Florida Coral Reef Water Quality Database Compilation (FCRWQDC). This work is a product of the University of South Florida Institute for Marine Remote Sensing (IMaRS), funded by the Florida Department of Environmental Protection (FDEP).

get data across all programs
library("here")
source(here("R/getAllData.R"))
df <- getAllData()
=== LOADING PROVIDER : SFER...
[1] "WARN - rows found with no location ID"
=== LOADING PROVIDER : MiamiBeach...
=== LOADING PROVIDER : BBWW...
=== LOADING PROVIDER : FIU_Estuaries...
=== LOADING PROVIDER : AOML_FBBB...
=== LOADING PROVIDER : BBAP...
=== LOADING PROVIDER : BROWARD...
=== LOADING PROVIDER : DEP...
=== LOADING PROVIDER : DERM_BBWQ...
=== LOADING PROVIDER : FIU_WQMP...
=== LOADING PROVIDER : PALMBEACH...
create .csv of all data
source(here("R/mutateWINTo2025.R"))
# reduce to only cols we need & save to csv
write.csv(mutateWINTo2025(df), here("data", "exports", "allData.csv"))
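
One detail worth knowing when reading the exported file back in: `write.csv()` writes row names by default, so a plain `read.csv()` of such a file yields an extra leading column named `X`. The sketch below shows this on a small invented data frame written to a temporary file. The `toy` data and the temporary path are assumptions for illustration and are not part of the pipeline above.

```r
# Toy stand-in for the exported data; values are invented for illustration.
toy <- data.frame(
  analyte = c("Salinity", "Nitrate"),
  value   = c(35.1, 0.02)
)

# Default behaviour: row names are written as an unnamed first column.
path_default <- tempfile(fileext = ".csv")
write.csv(toy, path_default)
back_default <- read.csv(path_default)
print(names(back_default))

# With row.names = FALSE only the data columns are written.
path_clean <- tempfile(fileext = ".csv")
write.csv(toy, path_clean, row.names = FALSE)
back_clean <- read.csv(path_clean)
print(names(back_clean))
```

When reading a file written with the defaults, the `X` column can simply be dropped after loading.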

List of Analytes:

list all analytes
print(unique(df$DEP.Analyte.Name))
 [1] "Temperature"             "Salinity"               
 [3] "Dissolved_Oxygen"        "Ammonium"               
 [5] "Nitrite"                 "Nitrate"                
 [7] "Nitrate+Nitrite"         "Orthophosphate"         
 [9] "Silicate"                "Chlorophyll_a"          
[11] "Pheophytin"              "pH"                     
[13] "Specific_Conductivity"   "Turbidity"              
[15] "Total_Kjeldahl_Nitrogen" "Phosphorus"             
[17] "Total_Nitrogen"          "Fecal_Coliforms"        
[19] "Enterococci"             "Ammonia"                
[21] "Ammonia+Ammonium"       
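
To see how the analytes listed above are distributed across providers, a cross-tabulation is one option. The sketch below uses a small invented data frame (`toy`) that has the same `DEP.Analyte.Name` and `program` columns as `df`; the rows are illustrative and are not taken from the compiled data.

```r
# Toy stand-in for df; the rows are invented for illustration only.
toy <- data.frame(
  DEP.Analyte.Name = c("Salinity", "Salinity", "Nitrate", "Turbidity", "Nitrate"),
  program          = c("SFER", "BBWW", "SFER", "DEP", "SFER"),
  stringsAsFactors = FALSE
)

# Number of rows per analyte (table rows) and program (table columns).
counts <- table(toy$DEP.Analyte.Name, toy$program)
print(counts)

# Total rows per analyte, most frequent first.
analyte_totals <- sort(rowSums(counts), decreasing = TRUE)
print(analyte_totals)
```

Running the same two calls on `df` should give the per-provider record counts for each of the 21 analytes.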

Overall statistics:

skimr on all data
library(skimr)
skim(df)
Data summary
Name df
Number of rows 1422618
Number of columns 126
_______________________
Column type frequency:
character 88
Date 1
logical 8
numeric 29
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
keyfield 1329448 0.07 24 43 0 8468 0
Activity.ID 1005946 0.29 5 36 0 86685 0
time 1329448 0.07 5 5 0 1423 0
Monitoring.Location.ID 22 1.00 0 25 23721 2003 0
station_type 1329470 0.07 1 1 0 2 0
depth_class 1329470 0.07 3 7 0 3 0
depth_order 1329470 0.07 4 5 0 12 0
notes 1422530 0.00 1 75 0 8 11
DEP.Analyte.Name 0 1.00 2 23 0 21 0
RowID 1399577 0.02 6 7 0 23041 0
ProgramID 1398667 0.02 4 4 0 2 0
Habitat 1078890 0.24 12 21 0 5 0
IndicatorID 1398667 0.02 1 1 0 3 0
IndicatorName 1398667 0.02 9 13 0 3 0
ParameterID 1398667 0.02 1 2 0 10 0
AreaID 1398667 0.02 1 2 0 2 0
ManagedAreaName 1398667 0.02 29 38 0 2 0
Activity.Type 1101931 0.23 5 35 0 12 0
RelativeDepth 1305519 0.08 3 7 0 3 0
TotalDepth_m 1422555 0.00 6 8 0 11 0
MDL 1102841 0.22 0 11 110146 241 0
PQL 1102841 0.22 0 11 110146 180 0
DetectionUnit 1420452 0.00 4 4 0 1 0
Value.Qualifier 894177 0.37 0 5 184005 127 0
ValueQualifierSource 1422618 0.00 NA NA 0 0 0
Result.Comments 1102753 0.22 0 874 280333 939 11
SEACAR_QAQCFlagCode 1398667 0.02 2 13 0 26 0
SEACAR_QAQC_Description 1398667 0.02 31 195 0 26 0
Include 1398667 0.02 1 1 0 2 0
MADup 1398667 0.02 1 1 0 1 0
ExportVersion 1398667 0.02 23 23 0 3 0
Region 1302832 0.08 2 19 0 22 0
DEP.Result.Unit 172638 0.88 0 10 18892 21 0
original.analyte.name 0 1.00 2 44 0 61 0
program 0 1.00 3 13 0 11 0
ProgramName 1398667 0.02 25 36 0 2 0
ParameterName 1399577 0.02 2 23 0 10 0
ParameterUnits 1399577 0.02 3 9 0 6 0
ProgramLocationID 1399577 0.02 1 2 0 55 0
ActivityType 1399577 0.02 5 6 0 2 0
SampleDate 1399577 0.02 23 23 0 2134 0
ActivityDepth_m 1422618 0.00 NA NA 0 0 0
ValueQualifier 1422618 0.00 NA NA 0 0 0
SampleFraction 1422618 0.00 NA NA 0 0 0
ResultComments 1420452 0.00 21 21 0 1 0
SEACAR_EventID 1399577 0.02 36 36 0 2394 0
data_source 1393652 0.02 5 10 0 2 0
CLIENT SAMPLE ID 1416693 0.00 1 3 0 70 0
LAB SAMPLE ID 1416693 0.00 11 11 0 494 0
MATRIX 1416693 0.00 5 5 0 1 0
COLLECTED 1416693 0.00 10 10 0 23 0
ANALYTE 1416693 0.00 8 27 0 12 0
SAMPLE RESULT 1417663 0.00 3 9 0 1545 0
REPORTING LIMIT 1419161 0.00 1 6 0 10 0
UNITS 1416693 0.00 3 10 0 8 0
METHOD 1416693 0.00 10 29 0 8 0
DILUTION 1416693 0.00 1 2 0 5 0
ANALYZED 1416693 0.00 10 10 0 123 0
PREPARED 1416693 0.00 10 10 0 127 0
source_file 1416693 0.00 19 22 0 23 0
SAMPLE COLLECTION DATE 1416693 0.00 9 10 0 23 0
DEP.Result.ID 1005961 0.29 3 8 0 327085 0
BASIN 1058748 0.26 2 5 0 6 0
CLUSTER 1058748 0.26 2 4 0 37 0
ZSI 1059903 0.25 2 5 0 25 0
ZONE 1331373 0.06 3 3 0 4 0
Long Deg 1326633 0.07 2 3 0 3 0
Long Min 1326633 0.07 1 22 0 4494 0
Lat Deg 1326633 0.07 2 2 0 2 0
Lat Min 1326633 0.07 1 21 0 3933 0
Organization.ID 582901 0.59 7 9 0 10 0
Org.Latitude..DD.MM.SS.SSSS. 1102841 0.22 0 11 309205 22 0
Org.Longitude..DD.MM.SS.SSSS. 1102841 0.22 0 12 309205 22 0
WBID 1102841 0.22 0 6 37805 115 0
Sample.Collection.Type 1102841 0.22 0 22 10452 4 0
Sampling.Agency.Name 585581 0.59 7 58 0 9 0
Activity.Depth.Unit 1102841 0.22 0 2 24857 3 0
Activity.Top.Depth 1102841 0.22 0 0 319777 1 0
Activity.Bottom.Depth 1102841 0.22 0 0 319777 1 0
Activity.Depth.Top.Bottom.Unit 1102841 0.22 0 0 319777 1 0
DEP.Result.Value.Text 1102841 0.22 0 12 311626 2 0
Sample.Fraction 1102841 0.22 0 9 131120 3 0
Lab.ID 1102841 0.22 0 6 102313 10 0
Audit.Censored.Decisions 1102841 0.22 0 0 319777 1 0
source 1102841 0.22 3 9 0 6 0
Value.1 964290 0.32 1 6 0 4499 0
Station 1419938 0.00 3 3 0 112 0
Date 1419938 0.00 11 14 0 663 0

Variable type: Date

skim_variable n_missing complete_rate min max median n_unique
Activity.Start.Date.Time 623217 0.56 23-09-01 2024-11-17 2019-11-20 2395

Variable type: logical

skim_variable n_missing complete_rate mean count
nisk_start 1422618 0 NaN :
nisk_end 1422618 0 NaN :
TIME 1422618 0 NaN :
DETECTION LIMITS 1422618 0 NaN :
NO3 DL 1422618 0 NaN :
DIN DL 1422618 0 NaN :
TON DL 1422618 0 NaN :
APA DL 1422618 0 NaN :

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100 hist
year 1329470 0.07 2020.43 2.89 2014.00 2018.00 2021.00 2023.00 2024.00 ▃▃▅▆▇
month 1329448 0.07 6.69 3.52 0.00 3.00 7.00 10.00 12.00 ▅▅▇▅▇
day 1329470 0.07 14.58 8.58 1.00 7.00 14.00 22.00 31.00 ▇▆▆▆▃
Org.Decimal.Latitude 560180 0.61 25.54 0.60 24.00 25.12 25.54 25.87 28.00 ▁▇▆▁▁
lat_min 1329470 0.07 30.30 15.98 0.03 20.41 33.11 42.78 59.83 ▆▅▇▇▃
lat_dec 1329470 0.07 25.77 1.02 24.40 24.93 25.45 26.58 28.78 ▇▆▃▃▁
Org.Decimal.Longitude 560180 0.61 -80.77 0.65 -85.00 -81.12 -80.61 -80.21 -80.00 ▁▁▁▃▇
lon_min 1329470 0.07 27.76 17.24 0.00 12.94 24.81 43.25 60.00 ▇▇▆▅▆
lon_dec 1329470 0.07 -81.77 0.93 -85.02 -82.49 -81.65 -81.17 -80.04 ▁▂▅▇▃
Activity.Depth 1080237 0.24 0.48 0.17 0.00 0.50 0.50 0.50 1.00 ▂▁▇▂▁
cast 1329470 0.07 0.72 0.46 0.00 0.00 1.00 1.00 2.00 ▃▁▇▁▁
DEP.Result.Value.Number 103767 0.93 1258.27 5457.01 -0.83 0.10 1.00 10.64 178550.00 ▇▁▁▁▁
Year 631822 0.56 2011.40 9.90 1989.48 2002.18 2018.00 2020.00 2024.00 ▁▃▃▁▇
Month 631800 0.56 6.57 3.45 0.00 4.00 7.00 10.00 12.00 ▅▅▇▆▇
ResultValue 1399577 0.02 12.86 41.22 0.00 0.29 4.89 24.47 5389.00 ▇▁▁▁▁
OriginalLatitude 1399577 0.02 25.80 0.02 25.77 25.78 25.79 25.81 25.87 ▆▇▂▁▂
OriginalLongitude 1399577 0.02 -80.15 0.01 -80.17 -80.16 -80.15 -80.13 -80.12 ▇▆▅▅▇
SURV 1058748 0.26 121.76 53.81 -8.00 80.00 125.00 167.00 211.00 ▂▆▇▇▇
STA 1058748 0.26 142.34 163.89 1.00 28.00 62.00 133.00 479.00 ▇▂▁▁▂
YEAR 1058748 0.26 2001.26 4.50 1989.48 1997.79 2001.56 2005.03 2008.73 ▁▅▇▇▇
NOX DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▇▃▆▁▁
NO2 DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▆▇▁▁▁
NH4 DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.01 ▇▁▁▁▂
TN DL 1058748 0.26 0.03 0.03 0.00 0.00 0.03 0.05 0.08 ▇▃▁▆▂
TP DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▇▁▁▁▁
SRP DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ▇▂▁▅▅
CHLA DL 1058748 0.26 0.10 0.00 0.10 0.10 0.10 0.10 0.10 ▁▁▇▁▁
TOC DL 1058748 0.26 0.12 0.04 0.05 0.12 0.12 0.16 0.16 ▅▁▁▇▆
SiO2 DL 1058748 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.01 ▇▁▁▁▁
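
The `n_missing` and `complete_rate` columns in the tables above are related by `complete_rate = 1 - n_missing / nrow(df)`. For example, `keyfield` has 1329448 missing values out of 1422618 rows, which gives a complete rate of about 0.07. Many columns are nearly empty, likely because they are populated only for rows from a single provider. The sketch below reproduces the calculation in base R on a small invented data frame; `toy` and its values are assumptions for illustration.

```r
# Toy stand-in for df; values are invented for illustration only.
toy <- data.frame(
  keyfield = c("a", NA, NA, NA),
  program  = c("SFER", "SFER", "DEP", "DEP"),
  value    = c(1.2, NA, 3.4, 5.6),
  stringsAsFactors = FALSE
)

# Missing values per column and the share of non-missing values.
n_missing     <- colSums(is.na(toy))
complete_rate <- 1 - n_missing / nrow(toy)

completeness <- data.frame(
  variable      = names(toy),
  n_missing     = unname(n_missing),
  complete_rate = round(unname(complete_rate), 2)
)

# Sparsest columns first.
completeness <- completeness[order(completeness$complete_rate), ]
print(completeness)
```

Sorting by `complete_rate` is a quick way to spot which columns are shared across providers and which are specific to one source.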
Create artistic data image
library(dplyr)
library(reshape2)  # for melt()
library(ggplot2)
library(viridis)
library(RColorBrewer)# for scale_fill_distiller()

# 1. Extract & drop NA
vals_raw <- df$DEP.Result.Value.Number
vals_raw <- vals_raw[!is.na(vals_raw)]

# 2. Log-transform
v1 <- log10(vals_raw + 1)

# 3. Percentile of the log-values
pct1 <- ecdf(v1)(v1)

# 4. Grid dims
N    <- length(pct1)
ncol <- ceiling(sqrt(N))
nrow <- ceiling(N / ncol)

# 5. Pad
pad_len <- (nrow * ncol) - N
p1_pad  <- c(pct1, rep(NA, pad_len))

# 6. Matrix & melt
mat_p1 <- matrix(p1_pad, nrow = nrow, ncol = ncol, byrow = TRUE)
mat_long_p1 <- melt(mat_p1, varnames = c("row","col"), value.name = "pct_log")

# 7. Plot
ggplot(mat_long_p1, aes(x = col, y = row, fill = pct_log)) +
  geom_tile(color = NA) +
  scale_fill_distiller(
    palette   = "Spectral",  # try "RdYlBu", "PuOr", "BrBG", etc.
    direction = 1,           # keep the palette's native order, so low values map to the red end
    na.value  = "grey90",    # color for the padded NA cells
    guide     = "none"       # hide the legend; remove if you want a colorbar
  ) +
  scale_y_reverse() +
  theme_void() +
  theme(legend.position = "none")
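
To make steps 1 to 6 of the image code easier to follow, here is a minimal walk-through on six invented values, stopping before the plot. The input vector and the names `n_col` and `n_row` are assumptions for illustration (the chunk above uses `ncol` and `nrow`).

```r
# Six invented values, one of them missing.
vals_raw <- c(0, 9, 99, 999, NA, 9)

# 1. Drop NA, 2. log-transform, 3. percentile rank via the empirical CDF.
vals <- vals_raw[!is.na(vals_raw)]
v    <- log10(vals + 1)
pct  <- ecdf(v)(v)

# 4. Near-square grid dimensions that can hold all values.
n     <- length(pct)
n_col <- ceiling(sqrt(n))
n_row <- ceiling(n / n_col)

# 5. Pad with NA so the vector fills the grid, 6. fill the matrix row by row.
padded <- c(pct, rep(NA, n_row * n_col - n))
grid   <- matrix(padded, nrow = n_row, ncol = n_col, byrow = TRUE)
print(grid)
```

Each cell of the image is therefore one measurement, in the order it appears in `df`, colored by its percentile rank. Because `log10()` is monotonic, these ranks match the ranks of the untransformed values, so the log step does not change the colors.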


If you have visualization ideas for this data, please open a GitHub issue here.