library(arrow)
library(beepr)
library(curl)
library(dplyr)
library(elevatr)
library(fs)
library(geobr)
library(h3jsr)
library(here)
library(htmltools)
library(lockr) # github.com/danielvartan/lockr
library(magrittr)
library(osfr)
library(osmdata)
library(raster)
library(readr)
library(sf)
library(sfarrow)
library(tictoc)
A Reproducible Pipeline for Generating Hexagonal Grids of Brazilian Municipalities
This pipeline is a Work In Progress (WIP) and is under active development. It may not yet be stable or suitable for public use. Please use it with caution and report any issues you encounter.
Overview
This report presents a reproducible pipeline for generating hexagonal grids of Brazilian municipalities. The pipeline was developed in the R programming language by Flávio Soares and Clara Penz, with further adaptations by Daniel Vartanian.
For instructions on how to run the pipeline, see the repository README.
Problem
Data Availability
The processed data are available in both rds and parquet formats through a dedicated repository on the Open Science Framework (OSF). A metadata file is included alongside the validated datasets.
Because the raw data are not publicly available, only authorized personnel can access the processed files. They are protected with RSA 4096-bit encryption (OpenSSL) and a 32-byte password to ensure data security.
If you already have access to the OSF repository and the project keys, you can access the data there. A backup copy of the raw data is also stored on OSF. Both sets of files can also be retrieved directly from R using the osfr package.
Methods
Source of Data
The data used in this report come from the following sources:
- Brazilian Institute of Geography and Statistics (IBGE): Data from the 2022 Brazilian Census.
- Mapzen Terrain Tile: Data on terrain and elevation.
- OpenStreetMap (OSM): Geospatial data on highways, roads, and other infrastructure for Brazilian municipalities.
- OpenTopography: Data on topography and elevation.
Data Munging
The data munging follows the data science workflow outlined by Wickham et al. (2023), as illustrated in Figure 1. All processing was carried out with the Quarto publishing system (Allaire et al., n.d.), the R programming language (R Core Team, n.d.), and several R packages.
For data manipulation and workflow, priority was given to packages from the tidyverse, rOpenSci and r-spatial ecosystems, as well as other packages adhering to the tidy tools manifesto (Wickham, 2023).
Source: Reproduced from Wickham et al. (2023).
Code Style
The Tidyverse code style guide and design principles were followed to ensure consistency and enhance readability.
Reproducibility
The pipeline is fully reproducible and can be run again at any time. To ensure consistent results, the renv package (Ushey & Wickham, n.d.) is used to manage and restore the R environment. See the README file in the code repository to learn how to run it.
Set the Environment
Load Packages
Set Keys
osf_pat <- Sys.getenv("OSF_PAT") # askpass()
osf_auth(osf_pat)
public_key <- here("_ssh", "id_rsa.pub")
private_key <- here("_ssh", "id_rsa")
password <- Sys.getenv("ACESSOSAN_PASSWORD") # askpass()
Set Input and Output Paths
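The loop in this step expects `dir_inputs` and `dir_parcial` to be defined already, but their definitions are not shown in this report. A minimal base R sketch of what such a setup might look like, assuming hypothetical folder names under a temporary directory (the pipeline's actual locations may differ):

```r
# Hypothetical locations; the pipeline's real paths are not shown here.
base_dir <- file.path(tempdir(), "brazil-hexagonal-grid")
dir_inputs <- file.path(base_dir, "inputs")
dir_parcial <- file.path(base_dir, "parcial")

# Create each directory only if it does not exist yet.
for (i in c(dir_inputs, dir_parcial)) {
  if (!dir.exists(i)) {
    dir.create(i, recursive = TRUE)
  }
}

# Confirm that both directories are now available.
all(dir.exists(c(dir_inputs, dir_parcial)))
```

This sketch uses base R `dir.exists()` and `dir.create()` so it runs without extra packages; the pipeline itself uses the equivalent functions from fs.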
for (i in c(dir_inputs, dir_parcial)) {
  if (!dir_exists(i)) {
    dir_create(i, recurse = TRUE)
  }
}
Set Municipality Data
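The municipalities in this step are identified by seven-digit IBGE codes, whose first two digits give the state (UF) code. As a quick sanity check, the sketch below validates the number of digits and derives the state code in base R. The helper name `uf_from_code` is hypothetical and not part of the pipeline.

```r
# Hypothetical helper: extract the two-digit state (UF) code from
# seven-digit IBGE municipality codes.
uf_from_code <- function(codes) {
  # Seven-digit codes lie between 1,000,000 and 9,999,999.
  stopifnot(all(codes >= 1000000 & codes <= 9999999))
  # Integer division drops the last five digits.
  codes %/% 100000
}

codes <- c(3550308, 2507507, 3106200, 4314902, 1721000, 5300108, 5208707)
uf <- uf_from_code(codes)
uf
```

A check like this can catch a mistyped code before any geometry is downloaded.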
municipios <- c(
  3550308, # São Paulo
  2507507, # João Pessoa
  3106200, # Belo Horizonte
  4314902, # Porto Alegre
  1721000, # Palmas
  5300108, # Brasília
  5208707 # Goiânia
)
Set Initial Variables
set.seed(2025)
Download IBGE Census Data
Download File
osf_raw_data_id <- "zuy4s"
osf_raw_data_files <-
  osf_raw_data_id |>
  osf_retrieve_node() |>
  osf_ls_files(
    type = "file",
    pattern = "censo2022_hex",
    n_max = Inf
  )
osf_raw_data_files
ibge_2022_census_hex_file <-
  osf_raw_data_files |>
  osf_download(path = dir_inputs, conflicts = "overwrite") |>
  extract2("local_path")
Unlock File
ibge_2022_census_hex_file <-
  ibge_2022_census_hex_file |>
  unlock_file(
    private_key = private_key,
    suffix = ".lockr",
    remove_file = TRUE,
    password = password
  )
Download Brazil OSM Data
file.path(
  "https://download.geofabrik.de",
  "south-america",
  "brazil-latest.osm.pbf"
) |>
  curl_download(
    destfile = here(dir_inputs, "brazil-latest.osm.pbf"),
    quiet = FALSE
  )
osm_brazil_latest_file <- here(dir_inputs, "brazil-latest.osm.pbf")
01.01-criar_malha_hexagonal_areas_total_urbana.R
Creates hexagonal grids for the municipalities to be analyzed.
Two steps are still missing from this script:
- Process the IBGE urbanized area.
- Generate hexagons for the urbanized area.
Read Brazil's Urbanized Hexagons with 2022 Census Data
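The read step in this section drops hexagons whose census variables are all zero. The base R sketch below applies the same rule to toy data. The column names `pop` and `dom` and the short addresses are made up for illustration.

```r
# Toy data: one address column plus two made-up value columns.
toy <- data.frame(
  h3_address = c("a", "b", "c"),
  pop = c(0, 12, 0),
  dom = c(0, 5, 3),
  stringsAsFactors = FALSE
)

# Columns other than the H3 address.
value_cols <- setdiff(names(toy), "h3_address")

# A row is dropped only when every value column equals zero.
all_zero <- rowSums(toy[value_cols] != 0) == 0
kept <- toy[!all_zero, ]
kept$h3_address
```

Row `a` is removed because both values are zero, while row `c` is kept because one of its values is non-zero.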
hexurb <-
  ibge_2022_census_hex_file |>
  read_delim(delim = ",") |>
  # Drop rows where every variable except `h3_address` is 0.
  filter(!if_all(-h3_address, \(x) x == 0))
hexurb |> glimpse()
Create Hexagonal Grid and Split into Urban and Non-Urban
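The loop in this step builds each grid at H3 resolution 9, where a hexagon covers roughly 0.105 km² on average. For a rough sense of scale, the base R sketch below estimates how many cells an area would need. The helper name `estimate_cells` is hypothetical, and the 1,500 km² figure is an arbitrary illustrative value, not a measured municipal area.

```r
# Approximate average hexagon area at H3 resolution 9, in km²
# (taken from the H3 resolution table; individual cells vary slightly).
h3_res9_area_km2 <- 0.1053

# Hypothetical helper: rough number of cells needed to cover an area.
estimate_cells <- function(area_km2, cell_area_km2 = h3_res9_area_km2) {
  ceiling(area_km2 / cell_area_km2)
}

n_cells <- estimate_cells(1500)
n_cells
```

The real count depends on the shape of each municipality's polygon, so this is only an order-of-magnitude guide for storage and run time.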
for (cod in municipios) {
  # Download the municipality geometry.
  municipio_geom <- read_municipality(code_muni = cod, year = 2020)
  # Convert the polygon to H3 cells.
  hex <- polygon_to_cells(geometry = municipio_geom$geom, res = 9)
  # Convert the H3 cells back to polygons.
  hexgrid <- cell_to_polygon(input = hex, simple = FALSE)
  print(paste("Hexagonal grid created for", cod))
  # Keep only the urbanized hexagons.
  hex_urb_mun <-
    hexgrid |>
    left_join(hexurb, by = "h3_address") |>
    filter(if_all(-h3_address, ~ !is.na(.x))) |>
    mutate(across(where(is.numeric) & !any_of("h3_address"), abs))
  print("Filtering done")
  # Create the output directory.
  dir_hex <- file.path(dir_parcial, cod, "hex")
  dir.create(dir_hex, showWarnings = FALSE, recursive = TRUE)
  # Save the files.
  # Full grid
  st_write_parquet(hexgrid, file.path(dir_hex, "hex.parquet"))
  # Urbanized grid
  st_write_parquet(hex_urb_mun, file.path(dir_hex, "hex_urbanizado.parquet"))
}
01.02-processar_elevation.R
Create a .tiff File for the Urbanized Area of Each Municipality
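The elevation request in this step uses zoom level `z = 13`. For Web Mercator terrain tiles, the ground resolution can be approximated from the zoom level and the latitude. The base R sketch below applies that formula. The 256-pixel tile size and the example latitude of -23.55° (roughly São Paulo) are assumptions for illustration, and `ground_resolution` is a hypothetical helper.

```r
# Approximate ground resolution (metres per pixel) of a Web Mercator
# tile, assuming 256-pixel tiles and the WGS 84 equatorial radius.
ground_resolution <- function(zoom, latitude_deg, tile_size = 256) {
  earth_circumference <- 2 * pi * 6378137
  cos(latitude_deg * pi / 180) * earth_circumference /
    (tile_size * 2^zoom)
}

res_equator <- ground_resolution(13, 0)
res_sao_paulo <- ground_resolution(13, -23.55)
round(c(res_equator, res_sao_paulo), 1)
```

Under these assumptions, each pixel covers about 19 m at the equator and slightly less at São Paulo's latitude, which is much finer than a resolution 9 hexagon.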
for (cod in municipios) {
  # Define the path of the `.tiff` file.
  elevation_path <- file.path(dir_parcial, cod, "elevation.tiff")
  # Read the municipality's `hexgrid`.
  hexgrid <-
    file.path(
      dir_parcial,
      cod,
      "hex",
      "hex_urbanizado.parquet"
    ) |>
    st_read_parquet()
  # Create the elevation raster (zoom `z = 13`).
  elev_raster <-
    hexgrid |>
    get_elev_raster(
      z = 13,
      override_size_check = TRUE
    )
  # Save the .tiff file.
  writeRaster(elev_raster, elevation_path, overwrite = TRUE)
  print(paste(".tiff file created for municipality", cod))
}
01.03-processar_osm.R
Create the Transport Network for the Urbanized Area of Each Municipality
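The extraction in this step passes the bounding box of the urbanized hexagons to Osmosis. The base R sketch below shows one way those arguments could be assembled and checked before the tool is called. The helper name `osmosis_bbox_args` and the coordinates are hypothetical.

```r
# Hypothetical helper: build Osmosis arguments from a named bounding
# box (xmin, ymin, xmax, ymax) and input/output file paths.
osmosis_bbox_args <- function(bbox, input_pbf, output_pbf) {
  stopifnot(
    all(c("xmin", "ymin", "xmax", "ymax") %in% names(bbox)),
    bbox[["xmin"]] < bbox[["xmax"]],
    bbox[["ymin"]] < bbox[["ymax"]]
  )
  c(
    "--read-pbf", input_pbf,
    "--bounding-box",
    paste0("left=", bbox[["xmin"]]),
    paste0("bottom=", bbox[["ymin"]]),
    paste0("right=", bbox[["xmax"]]),
    paste0("top=", bbox[["ymax"]]),
    "--write-pbf", output_pbf
  )
}

# Made-up coordinates in decimal degrees.
bbox <- c(xmin = -46.83, ymin = -24.01, xmax = -46.36, ymax = -23.36)
osmosis_args <- osmosis_bbox_args(
  bbox,
  "brazil-latest.osm.pbf",
  "redeviaria.osm.pbf"
)
osmosis_args
```

Checking the box up front turns a swapped or missing coordinate into an immediate error instead of an empty extract after a long Osmosis run.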
for (cod in municipios) {
  print(paste("Processing", cod))
  # Set the output directory.
  dir_mun <- file.path(dir_parcial, cod)
  dir.create(dir_mun, showWarnings = FALSE, recursive = TRUE)
  # Bounding box of the municipality's urbanized hexagons.
  mun_hex <-
    file.path(
      dir_mun,
      "hex",
      "hex_urbanizado.parquet"
    ) |>
    st_read_parquet()
  mun_bbox <- st_bbox(mun_hex)
  br_pbf <- osm_brazil_latest_file
  mun_pbf <- file.path(dir_mun, "redeviaria.osm.pbf")
  # Run Osmosis to extract the area inside the bounding box.
  tic(msg = paste("Extracting OSM network for", cod))
  system2(
    "osmosis",
    args = c(
      paste("--read-pbf", br_pbf),
      "--bounding-box",
      paste0("left=", mun_bbox["xmin"]),
      paste0("bottom=", mun_bbox["ymin"]),
      paste0("right=", mun_bbox["xmax"]),
      paste0("top=", mun_bbox["ymax"]),
      paste("--write-pbf", mun_pbf)
    )
  )
  toc()
}
Citation
When using this data, you must also cite the original data sources.
To cite this work, please use the following format:
Soares, F., Penz, C., Vartanian, D., Fernandes, C. N., & Giannotti, M. A. (2025). A reproducible pipeline for generating hexagonal grids of Brazilian municipalities [Computer software]. Center for Metropolitan Studies of the University of São Paulo. https://cem-usp.github.io/brazil-hexagonal-grid
A BibTeX entry for LaTeX users is
@software{soares2025,
title = {A reproducible pipeline for generating hexagonal grids of Brazilian municipalities},
author = {{Flávio Soares} and {Clara Penz} and {Daniel Vartanian} and {Camila Nastari Fernandes} and {Mariana Abrantes Giannotti}},
year = {2025},
address = {São Paulo},
institution = {Center for Metropolitan Studies of the University of São Paulo},
langid = {en},
url = {https://cem-usp.github.io/brazil-hexagonal-grid}
}
License
The original data sources may be subject to their own licensing terms and conditions.
The code in this report is licensed under the GNU General Public License Version 3, while the report is available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International.
Copyright (C) 2025 Center for Metropolitan Studies
The code in this report is free software: you can redistribute it and/or
modify it under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your option)
any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY
WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with
this program. If not, see <https://www.gnu.org/licenses/>.
