Anita Graser’s Blog

https://anitagraser.com

  • 16 September 2023: Data engineering for Mobility Data Science (with Python and DVC)
    This summer, I once again had the honor of speaking at the OpenGeoHub Summer School. This time, I wanted to challenge the students and myself by not just doing MovingPandas but by introducing both MovingPandas and DVC for Mobility Data Science. I’ve previously written about DVC and how it may be used to track geoprocessing workflows with QGIS & DVC. In my summer school session, we go into detail on how to use DVC to keep track of MovingPandas movement data analytics workflows. You can find the materials at https://github.com/movingpandas/movingpandas-examples/blob/opengeohub2023/README.md and the recording of the session live stream at https://www.youtube.com/watch?v=roPF1oth2Pk …
  • 1 September 2023: Comparing geographic data analysis in R and Python
    Today, I want to point out a blog post over at https://geocompx.org/post/2023/ogh23/ written together with my fellow “Geocomputation with Python” co-authors Robin Lovelace, Michael Dorman, and Jakub Nowosad. In this blog post, we talk about our experience teaching R and Python for geocomputation. The context of this blog post is the OpenGeoHub Summer School 2023 which has courses on R, Python and Julia. The focus of the blog post is on geographic vector data, meaning points, lines, polygons (and their ‘multi’ variants) and the attributes associated with them. We plan to cover raster data in a future post. …
  • 20 August 2023: I’ve archived my Tweets: Goodbye Twitter, Hello Mastodon
    Today, Jeff Sikes (@box464@firefish.social) alerted me to the fact that “Twitter has removed all media attachments from 2014 and prior” (source: https://firefish.social/notes/9imgvtckzqffboxt). So far, it seems unclear whether this was intentional or a system failure (source: https://mas.to/@carnage4life/110922114407553901). Since I’ve been on Twitter since 2011, this means that some media files are now lost. While the loss of a few low-res images is probably not a major loss for humanity, I would prefer to have some control over when and how content I created vanishes. So, to avoid losing more content, I have followed Jeff’s recommendation to create a proper archival page: https://anitagraser.github.io/twitter-archive/. It is based on an export I pulled in October 2022 when I started to use Mastodon as my primary social media account. Unfortunately, this export did not include media files. To follow me in the future, find me on https://fosstodon.org/@underdarkGIS. By the way, a recent study published in Nature News shows that Mastodon is the top-ranking Twitter replacement for scientists. To find other interesting people on Mastodon, there are many useful tools and lists, including, for ex …
  • 21 May 2023: Analyzing video-based bicycle trajectories
    Did you know that MovingPandas also supports local image coordinates? Indeed, it does. In today’s post, we will explore how we can use this feature to analyze bicycle tracks extracted from video footage published by Michael Szell (@mszll). Dataset: https://zenodo.org/record/7288616. Data description: https://arxiv.org/abs/2211.01301. The bicycle trajectory coordinates are stored in two separate lists: xs_640x360 and ys_640x360. This format is similar to the Kaggle Taxi dataset we worked with in the previous post. However, to use the solution we implemented there, we need to combine the x and y coordinates into nice (x,y) tuples:
        df['coordinates'] = df.apply(
            lambda row: list(zip(row['xs_640x360'], row['ys_640x360'])), axis=1)
        df.drop(columns=['xs_640x360', 'ys_640x360'], inplace=True)
    Afterwards, we can create the points and compute the proper timestamps from the frame numbers:
        def compute_datetime(row):
            # some educated guessing going on here: the paper states
            # that the video covers 2021-06-09 07:00-08:00
            d = datetime(2021,6,9,7,0,0) + (row['frame_in'] + row['running_number']) * timedelta(seconds=2)
            return d

        def create_point(xy):
            try:
                return Point(xy)
            except TypeError:  # when th …
  • 12 May 2023: How to use Kaggle’s Taxi Trajectory Data in MovingPandas
    Kaggle’s “Taxi Trajectory Data from ECML/PKDD 15: Taxi Trip Time Prediction (II) Competition” is one of the most used mobility / vehicle trajectory datasets in computer science. However, in contrast to other similar datasets, Kaggle’s taxi trajectories are provided in a format that is not readily usable in MovingPandas since the spatiotemporal information is provided as:
        TIMESTAMP: (integer) Unix Timestamp (in seconds). It identifies the trip’s start;
        POLYLINE: (String): It contains a list of GPS coordinates (i.e. WGS84 format) mapped as a string. The beginning and the end of the string are identified with brackets (i.e. [ and ], respectively). Each pair of coordinates is also identified by the same brackets as [LONGITUDE, LATITUDE]. This list contains one pair of coordinates for each 15 seconds of trip. The last list item corresponds to the trip’s destination while the first one represents its start;
    Therefore, we need to create a DataFrame with one point + timestamp per row before we can use MovingPandas to create Trajectories and analyze them. But first things first. Let’s download the dataset:
        import datetime
        import pandas as pd
        import geopandas as gpd
        import movingpandas as mp …
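The reshaping described in this excerpt (one point plus timestamp per row) can be sketched as follows. This is a minimal illustration, not the code from the post: the function name `explode_polyline` is hypothetical, the `TRIP_ID` column is an assumption about the dataset, and only the `TIMESTAMP` and `POLYLINE` semantics and the 15-second spacing come from the field descriptions above.

```python
import json

import pandas as pd


def explode_polyline(df: pd.DataFrame, interval_s: int = 15) -> pd.DataFrame:
    """Turn one row per trip into one row per GPS fix.

    Assumes columns TRIP_ID (assumed name), TIMESTAMP (Unix seconds of the
    trip start) and POLYLINE (string such as "[[lon, lat], [lon, lat]]").
    """
    rows = []
    for trip_id, start, polyline in zip(df["TRIP_ID"], df["TIMESTAMP"], df["POLYLINE"]):
        coords = json.loads(polyline)  # string -> list of [lon, lat] pairs
        for i, (lon, lat) in enumerate(coords):
            rows.append({
                "TRIP_ID": trip_id,
                # the i-th fix is recorded i * interval_s seconds after the start
                "t": pd.Timestamp(int(start) + i * interval_s, unit="s"),
                "lon": lon,
                "lat": lat,
            })
    return pd.DataFrame(rows, columns=["TRIP_ID", "t", "lon", "lat"])


if __name__ == "__main__":
    trips = pd.DataFrame({
        "TRIP_ID": ["demo"],
        "TIMESTAMP": [1372636858],
        "POLYLINE": ["[[-8.618643, 41.141412], [-8.618499, 41.141376]]"],
    })
    print(explode_polyline(trips))
```

From such a table, the remaining steps are to build point geometries from `lon`/`lat` and hand the result to MovingPandas. Trips with an empty POLYLINE simply contribute no rows.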
  • 30 March 2023: Deep learning from trajectory data
    I’ve previously written about Movement data in GIS and the AI hype and today’s post is a follow-up in which I want to share with you a new review of the state of the art in deep learning from trajectory data. Our review covers 8 use cases: location classification, arrival time prediction, traffic flow / activity prediction, trajectory prediction, trajectory classification, next location prediction, anomaly detection, and synthetic data generation. We particularly looked into the trajectory data preprocessing steps and the specific movement data representation used as input to train the neural networks. On a completely subjective note: the prize for most surprising approach goes to natural language processing (NLP) Transformers for traffic volume prediction. The paper was presented at BMDA2023 and you can watch the full talk recording here: https://www.youtube.com/watch?v=zHI-52U8gjU References: Graser, A., Jalali, A., Lampert, J., Weißenfeld, A., & Janowicz, K. (2023). Deep Learning From Trajectory Data: a Review of Neural Networks and the Trajectory Data Representations to Tra …
  • 25 February 2023: Tracking geoprocessing workflows with QGIS & DVC
    Today’s post is a geeky deep dive into how to leverage DVC (not just) data version control to track QGIS geoprocessing workflows. “Why is this great?” you may ask. DVC tracks data, parameters, and code. If anything changes, we simply rerun the process and DVC will figure out which stages need to be recomputed and which can be skipped by re-using cached results. This can lead to huge time savings compared to re-running the whole model. You can find the source code used in this post on my repo https://github.com/anitagraser/QGIS-resources/tree/dvc. I’m using DVC with the DVC plugin for VSCode but DVC can be used completely from the command line, if you prefer this approach. Basically, what follows is a proof of concept: converting a QGIS Processing model to a DVC workflow. In the following screenshot, you can see the main stages: the QGIS model in the upper left corner; the Python script exported from the QGIS model builder in the lower left corner; the DVC stages in my dvc.yaml file in the upper right corner (please ignore the hello world stage, it’s a leftover from my first experiment); and the DVC DAG visualizing the sequence of stages. Looks similar to the QGIS model, doesn’t it 😉 B …
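The idea of skipping stages whose inputs have not changed can be illustrated with a toy sketch. This is not how DVC is implemented and it uses no DVC API: `file_md5`, `run_if_changed` and the in-memory `lock` dictionary are hypothetical names made up for this example. It only shows the general principle of comparing dependency hashes against the hashes recorded at the last run.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict, Iterable


def file_md5(path: Path) -> str:
    """Hash a file's content so that changes can be detected."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def run_if_changed(
    name: str,
    deps: Iterable[Path],
    lock: Dict[str, Dict[str, str]],
    action: Callable[[], None],
) -> bool:
    """Run `action` only if a dependency changed since the last recorded run.

    `lock` maps stage names to the dependency hashes seen at their last run.
    Returns True if the stage was executed and False if it was skipped.
    """
    current = {str(p): file_md5(p) for p in deps}
    if lock.get(name) == current:
        return False  # nothing changed: the previous result can be reused
    action()
    lock[name] = current  # remember what this run was based on
    return True
```

Calling `run_if_changed` twice in a row with unchanged input files executes the action only once. Editing any dependency makes the next call run the action again.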
  • 21 January 2023: PyQGIS Jupyter notebooks on Windows using Conda
    The QGIS conda packages have been around for a while. One of their use cases, for example, is to allow Linux users to easily install multiple versions of QGIS. Similarly, we’ve seen posts on using PyQGIS in Jupyter notebooks. However, I find the setup with *.bat files rather tricky. This post presents a way to set up a conda environment with QGIS that is ready to be used in Jupyter notebooks. The first steps are to create a new environment and install QGIS. I use mamba for the installation step because it is faster than conda but you can use conda as well:
        (base) PS C:\Users\anita> conda create -n qgis python=3.9
        (base) PS C:\Users\anita> conda activate qgis
        (qgis) PS C:\Users\anita> mamba install -c conda-forge qgis=3.28.2
        (qgis) PS C:\Users\anita> qgis
    If we now try to import the qgis module in Python, we get an error:
        (qgis) PS C:\Users\anita> python
        Python 3.9.15 | packaged by conda-forge | (main, Nov 22 2022, 08:41:22) [MSC v.1929 64 bit (AMD64)] on win32
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import qgis
        Traceback (most recent call last):
          File "<stdin>", line 1, in <module>
        ModuleNotFoundError: No module named 'qgis'
    To fix this error, we need to get the p …
  • 29 December 2022: MovingPandas v0.13 & v0.14 released!
    December has been busy with two new MovingPandas releases: v0.13 and v0.14. The latest v0.14 release is now available from conda-forge. These releases are a huge step forward towards making MovingPandas easier to install with fewer mandatory dependencies. All interactive plotting libraries are now optional. So if you are using MovingPandas for trajectory data processing in the background and don’t need the interactive visualization features, the number of necessary libraries is now much lower. This (and the fact that GeoPandas is now shipped with OSGeo4W) will also make it easier to use MovingPandas in QGIS plugins. New features: #268 New add_angular_difference method. Includes fixes and enhancements for: #267 Improved documentation: direction values are [0, 360). Behind the scenes: #269 Fixed read the docs build; #261 Made interactive plotting libraries optional; #257 Fixed broken pre-commit; Created a Mastodon account. As always, all tutorials are available from the movingpandas-examples repository and on MyBinder. If you have questions about using MovingPandas or just want to discuss new ideas, you’re welcome to join our discussion forum. …
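To give a feel for what an angular difference between two direction values in [0, 360) means, here is a small stand-alone sketch. It is not the MovingPandas implementation of add_angular_difference. The function name `angular_difference` and the choice to return the smaller of the two possible angles are assumptions made for this illustration.

```python
def angular_difference(direction1: float, direction2: float) -> float:
    """Smaller angle in degrees between two compass directions.

    Inputs are expected in [0, 360); the result lies in [0, 180].
    """
    diff = abs(direction1 - direction2) % 360.0
    # going "the other way round" the circle may be shorter
    return 360.0 - diff if diff > 180.0 else diff


if __name__ == "__main__":
    print(angular_difference(350.0, 10.0))
    print(angular_difference(90.0, 270.0))
```

The wrap-around case is the reason a plain subtraction is not enough: headings of 350 and 10 degrees are only 20 degrees apart, not 340.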
  • 19 November 2022: Visualizing trajectories with QGIS & MobilityDB
    In the previous post, we (creatively 😉) used MobilityDB to visualize stationary IoT sensor measurements. This post covers the more obvious use case of visualizing trajectories. It brings together the MobilityDB trajectories created in Detecting close encounters using MobilityDB 1.0 and visualization using Temporal Controller. Like in the previous post, the valueAtTimestamp function does the heavy lifting. This time, we also apply it to the geometry time series column called trip:
        SELECT mmsi,
            valueAtTimestamp(trip, '2017-05-07 08:55:40') geom,
            valueAtTimestamp(SOG, '2017-05-07 08:55:40') SOG
        FROM "public"."ships"
    Using this SQL query, we again set up a QueryLayer that is not yet controlled by Temporal Controller. To configure Temporal Controller to update the timestamp in our SQL query, we again need to run the Python script from the previous post. With this done, we are all set up to animate and explore the movement patterns in our dataset. This post is part of a series. Read more about movement data in GIS. …
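The step of updating the timestamp in the SQL query can be sketched in plain Python. This is not the script from the previous post and it makes no PyQGIS calls: the helper `set_query_timestamp` is a hypothetical name, and the sketch assumes that the timestamp literals in the query always follow the 'YYYY-MM-DD HH:MM:SS' pattern used in the query above.

```python
import re
from datetime import datetime

# Matches quoted literals such as '2017-05-07 08:55:40'
TIMESTAMP_LITERAL = re.compile(r"'\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}'")


def set_query_timestamp(sql: str, t: datetime) -> str:
    """Replace every timestamp literal in `sql` with the given time."""
    literal = t.strftime("'%Y-%m-%d %H:%M:%S'")
    return TIMESTAMP_LITERAL.sub(literal, sql)


if __name__ == "__main__":
    query = (
        "SELECT mmsi, "
        "valueAtTimestamp(trip, '2017-05-07 08:55:40') geom, "
        "valueAtTimestamp(SOG, '2017-05-07 08:55:40') SOG "
        'FROM "public"."ships"'
    )
    print(set_query_timestamp(query, datetime(2017, 5, 7, 9, 0, 0)))
```

In an animation, a function like this would be called for each frame with the current time of the temporal range, and the resulting query would be applied to the layer.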