📍 Global 🌎
Vincent Clemson
Education
Career Background
👋 Hi, I’m Vincent. Over the past two years, I have traveled around the world, exploring 24 new countries & 10 US states.
I blog about using open source tools to analyze & visualize data. Previously, I worked at Booz Allen, where I conducted & led a study on NATO¹ object detectors using commercial satellite imagery on the Bighorn AI edge kit team. Before that, I worked at Peraton for five years on a GEOINT² performance modeling & analytics team, where I analyzed billions of records of transactional data on geospatial imagery stored by the NGA³ to help optimize the NSG⁴. Additionally, I conducted data-driven analyses in the fields of remote sensing & astrodynamics to support the systems engineering & integration of the US national security satellite mission NTM⁵-L. There, I promoted open source software development best practices and data science tooling usage in Python & R. I managed my team’s corporate & government GitHub accounts, growing the organization from 0 to 300+ repositories.
AI Engineer – Booz Allen Hamilton – CTO
- Conducted a statistical performance analysis for evaluating Preligens’ aircraft & building geospatial computer vision algorithm detector models for Booz Allen’s Chief Technology Office 🛩 🏘️
i.e. Developed a Python module for querying a web API against thousands of labelled Maxar satellite images, leveraged polars for data engineering & ETL, used R’s {sf} & {terra} for spatial analysis & performance stats, & created a Quarto website documenting the entire analysis, from abstract to results.
- Designed & prototyped multiple iterations of a human performance dashboard for JSOC⁶ using Oura Ring soldier biometric performance data (e.g. R {flexdashboard}, PowerBI, & Tableau)
Systems Engineer – Peraton
- Data science & analytics on the NGA’s enterprise systems engineering contract (NEE/SEIN)
- Analyzed the majority of imagery product transactions sent across the IC (e.g. NRO⁷/NGA’s ground CONOPS⁸)
- Wrangled large amounts of historical categorical, numerical, spatial, & temporal GEOINT metadata using extremely efficient in-memory & out-of-memory tools (e.g. data.table, Apache Arrow, & Parquet file data lakes)
- Designed ETLs to pull and join data from disparate sources (e.g. APIs, S3 buckets, and databases) into tidy datasets for analysis (e.g. tracked imagery transaction timelines from satellite tasking to processing / exploitation)
- Designed, developed, & maintained dashboards, visualizations (plots & charts), & data lakes for reporting engineering performance statistics on satellite camera sensors, as well as military base / intelligence site comms
- Conducted orbital mechanics analyses using ephemeris & simulators of an ABI⁹ satellite / ground sensor system
- Prototyped, developed, & maintained modeling tools to conduct EDA on data for analyzing patterns, trends, & spatial/geometric relationships (e.g. ggplot2, sf, Plotly, Matplotlib, Leaflet, Dash, Shiny, Docker, & Cloud Foundry)
- Statistical analysis on the performance, sizing, & budgeting of NSG imagery & their driving relationships (e.g. linear trend models, bandwidth models, human-in-the-loop supervised/unsupervised EDA ML workflows)
- Worked on a distributed team & operated in a cloud computing environment. Experience with building a cloud from the ground up, config management, & permissions (e.g. AWS, RStudio Server Pro, Unix/Linux, VPC)
- Gathered analytics requirements from lead/chief engineers & mentored teammates, from juniors & peers to leads, on analytic capabilities in 1-on-1, open forum, & presentation settings.
Application Developer Intern – JP Morgan Chase
- Worked on an Agile development team in JPMC’s Technology Analyst Program. Our team of six interns built a full stack Java-Spring tool that aggregated data for planning & executing the migration & decommissioning of legacy JPMC data center servers. Worked on both frontend & backend. Served as Scrum Master.
Data Analytics Intern – IMG Learfield & Penn State Athletics
- Analyzed unstructured season ticket holder survey text data using NLP¹⁰ techniques & the NLTK in Python (e.g. tokenizers, collocations)
- Performed Decision Tree Modelling in R for finding trends between customers and ticket sale renewals
What I’m interested in doing
Projects on the Web
- Combining math & code into bite-sized technical explanatory articles on my travel-themed Project Euler listing
- Designed aesthetic & interactive web maps in R using ggplot2, JavaScript, & SVG
- Rapidly visualized natural disasters around the 🌍 by developing web map tools that dynamically tile satellite imagery from Maxar’s Open Data Program using STAC¹¹ & Leafmap
- Developed & deployed analytic Dash & {shiny} web apps in Python & R for NGA mission analysts to predict years of geospatial coverage of NTM Earth Observation Satellite Systems
- Built Spatial Machine Learning Models using {mlr3} in R to run within GitHub Actions
- Worked through all of Tomas Beuzen’s Deep Learning with PyTorch & ported it to render w/ nbdev & Quarto
- Improved dev & data science workflows by teaching engineers to version control their code using Git
- Created Quarto, R & RStudio CLI utility shims to handle multiple Quarto/R installations
- Started developing an R package, {leaflet.super}, to visualize big geospatial data with Leaflet & Arrow
- Built exploratory unsupervised clustering tools to drive insight from imagery analyst user activity data
- Military wartime border region behavior is of interest within the geospatial intelligence domain, so I built tools for analyzing satellite image distance to border regions & for creating new geometric border regions
- Quantifying size & types of collections (e.g. different camera sensor modes) is critical in Earth observation satellite systems engineering, so I’ve built different analytical product mapping tools to help do so: e.g. advanced Plotly map animations & interactive Leaflet htmlwidget heatmaps
- I version control my macOS dot-profile & config files for rapid dev/data-sci setup
Programming Skills
Below is a non-exhaustive, high-level list of the technologies that I work with.
Python, Conda/Mamba, Jupyter, numpy, pandas, Docker, Kubernetes, SQL, JavaScript, Node.js, Bash, Zsh, tmux, VSCode, R, Quarto, R Markdown, GNU Make, GitHub Actions, Leafmap, Google Earth Engine, QGIS, GDAL
Machine Learning Skills
Spatial Cross-Validation Techniques, Discrete Event Simulation, Generalized Linear Models, Ensemble Models, Unsupervised Learning, Principal Components Analysis, Clustering Techniques, Feature Selection
CNNs (Convolutional Neural Networks), GANs (Generative Adversarial Networks), Gradient Descent, Regularization, Decision Boundary, One-vs-All Multiclass Classification, Backpropagation and Advanced Optimization techniques
1. NATO - North Atlantic Treaty Organization
2. GEOINT - Geospatial Intelligence
3. NGA - National Geospatial-Intelligence Agency
4. NSG - National System for Geospatial-Intelligence
5. NTM - National Technical Means
6. JSOC - Joint Special Operations Command
7. NRO - National Reconnaissance Office
8. CONOPS - Concept of Operations
9. ABI - Activity Based Intelligence
10. NLP - Natural Language Processing
11. STAC - SpatioTemporal Asset Catalog
12. SSG - Static Site Generator