
AI
LABS

Osinto's Labs initiative is where we build and experiment - where we research intelligence.

WHAT'S THIS ALL ABOUT?

We get excited by pushing back the boundaries of what's possible, undertaking applied research in a commercial setting. To do so we conduct practical experiments, testing ideas against the harsh realities of international commerce.

We're not backed by government grants or philanthropic donations - Osinto is a for-profit business. We work for ourselves and our clients. So all of our experiments are inherently commercial in nature. Our experience tells us that commercially driven, applied research can be incredibly productive and efficient. We're inspired by what's been achieved at outfits like Alphabet's X and Lockheed Martin's Skunk Works and seek to build something similarly ambitious and disruptive.

We list on this page some of the experiments we would like to undertake in the hope of finding collaborators and backers.

If you see something that's of interest and you want to work on it with us - get in touch!

Partners on a project might come with cash, equipment, expertise, people - or a combination of them all. You might want us to work on this privately and keep the findings to ourselves. You might also want to 'build in public' - opening up to share findings and data with the wider world.

We don't come with preconceptions, focusing instead on getting things done. Experiments might be spun out as separate ventures, or they might die a death after a matter of weeks if their hypotheses don't stand up to real-world scrutiny. We're not precious about our ideas - so come with criticism, tell us why we're crazy or ill-informed - we learn from it all.

DEFINING THE UNKNOWN

STRUCTURE AMIDST CHAOS

Working at the forefront of anything can be chaotic. This is how we attempt to bring structure and discipline to our research experiments. The Experiment Card below is Version 0.1 of our template for defining an Osinto Labs Experiment. The intent is to provide a guiding framework that helps us gather our thoughts, direct our efforts and lay out for potential partners a succinct vision of what we're trying to do, why and what we need to make it happen.

These are experiments - so we expect many / most of them to fail, and that's fine. Of course we're always aiming for success!

The idea is always to get started, quickly. And sometimes to finish an experiment just as fast. Life is too short to be precious about ideas - if a hypothesis is quickly disproven, we simply stop and move on to the next experiment.

EXPERIMENT TITLE

EXPERIMENT SUBTITLE

Description

Explain in a few sentences what this experiment seeks to achieve, and why.

What we want to do

Outline the work the Osinto team intends to undertake within the framework of this experiment and the anticipated outcomes we'll use to measure success.

Our unfair advantages

We like to play in areas where we bring some unique insights or leverage - they're outlined here.

What we need:

This is where we list what we're lacking in order to get started. It'll nearly always consist of:

Money - a budget to cover experiment costs
Non-financial resources - this might include specialist equipment we need for a project
People - who do we need to make this happen? Defining expertise we don't have, but that the experiment requires

AI EXPERIMENTS

EXPERIMENT I
BIOACOUSTIC MONITORING PROTOTYPE

NETWORK OF LOW COST AUDIO SENSORS COUPLED WITH MACHINE LEARNING MODEL(S) TO AID IN PERSISTENT DETECTION AND IDENTIFICATION OF WILDLIFE SPECIES FOR BIODIVERSITY MONITORING

Description

We hypothesise that a confluence of low-cost consumer electronics (e.g. USB microphones, Raspberry Pi computers, smartphones, lithium-ion batteries, small solar PV panels) and open source Machine Learning (ML) models makes it possible to build an array of sensors for the persistent, automated detection and monitoring of wildlife, at significantly lower cost than has been possible to date using more manual, sporadic methods of measurement (e.g. on-site surveys by ecologists).

With an increasing number of 'rewilding' and biodiversity monitoring projects ongoing worldwide, and an increasing range of attractive subsidy mechanisms coming into force to encourage them, we understand there is also a growing need to monitor the species present at a given location over time. Such monitoring can prove net gains in biodiversity (e.g. in order to receive subsidy payments) and aid wider species monitoring and conservation efforts.

At present much of this work is done manually by ecologists, but only for the limited periods they are able to be on-site (e.g. with photographic or sound recording equipment), during which human presence inevitably causes some disturbance to the ecosystem.

An automated, persistent bioacoustic monitoring and identification system should aim to reduce cost (vs. repeated site visits) and dramatically increase data volume and quality. We also hope to prove that using machine learning models to sort signal from noise, de-duplicate and pre-categorise data can aid human ecologists in their analysis of monitoring data.
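To make the "sort signal from noise, de-duplicate" step concrete, here is a minimal Python sketch of how classifier output from several sensors might be tidied before an ecologist sees it. Everything in it is an assumption for illustration rather than our actual design: the `Detection` record, the `filter_and_deduplicate` function, the 0.7 confidence threshold and the 10-second window are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """One hypothetical classifier output from one audio sensor."""
    sensor_id: str
    species: str
    timestamp_s: float  # seconds since the start of the monitoring session
    confidence: float   # classifier score between 0 and 1


def filter_and_deduplicate(detections, min_confidence=0.7, window_s=10.0):
    """Drop low-confidence detections (noise), then skip repeats of the
    same species heard within window_s seconds of the last kept detection,
    regardless of which sensor picked them up."""
    kept = []
    last_kept = {}  # species -> timestamp of the most recent kept detection
    for d in sorted(detections, key=lambda d: d.timestamp_s):
        if d.confidence < min_confidence:
            continue  # treated as noise (assumed threshold)
        previous = last_kept.get(d.species)
        if previous is not None and d.timestamp_s - previous <= window_s:
            continue  # same call already recorded, probably by a neighbouring sensor
        kept.append(d)
        last_kept[d.species] = d.timestamp_s
    return kept
```

In practice the threshold and window would need tuning per species and per site, and the right values are exactly the kind of thing the prototype network is meant to help us find out.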

What we want to do

Establishment of a prototype device and small test network in a biodiverse area in a national park in the UK to see if the hypotheses above stand up to practical experimentation.

Anticipate a 3-month project from outset.

Outcomes expected to include:

Determination of viability of producing such a network
Determination of forecast cost to provide bioacoustic-detection-network-as-a-service
Feedback from prospective clients for a network on functionality and pricing

Our unfair advantages

24/7 access to private plot of land in South Downs National Park with 25+ bird and mammal species already identified
Access to pool of professional ecologists on weekly basis for advice and feedback
Initial research completed to identify possible ML libraries, possible device design and component list, and competing technologies at the mature end (e.g. radar-based systems for bird detection at offshore wind farms) and the lower end (smartphone apps, commercial audio capture devices used by ecologists in the UK)
Prospective reserved access to a cluster of low cost, specialised compute in the UK for ML model training
Direct professional links to one of the UK's foremost rewilding projects for feedback and a possible later commercial-scale test

What we need

£50,000 - MLOps Engineer time, Data Scientist time, hardware procurement, cloud services setup for data storage, budget for GPU-accelerated compute rental for ML model training
We have all expertise and contacts within our existing network to complete this project

WHAT'S NEXT?

BUILDING A PIPELINE

Below is a list of Experiments we'd like to spin up, or are already spinning up. If you see something here you're impatient to work on with us, and you can bring to the table money or other useful resources to make it happen, drop an email to hello@osinto.com with the Experiment title, or contact us here:

GAINING MODEL FAMILIARITY

FINE TUNE AN IMAGE GENERATION MODEL ON A DOMAIN-SPECIFIC DATASET
FINE TUNE A VIDEO GENERATION MODEL ON A DOMAIN-SPECIFIC DATASET
FINE TUNE OPEN SOURCE LLM WITH UKRAINIAN LANGUAGE
TEST STATE OF OPEN SOURCE COMPONENTS FOR BUILDING LOCAL AI ASSISTANT WITH CONVERSATIONAL SPEECH-TO-TEXT + TEXT-TO-SPEECH INTERFACE, RUNNING LOCALLY ON A 'NORMAL' SPEC LAPTOP

SOFTWARE PRODUCT PROTOTYPING / FEASIBILITY TESTING

MIXTURE OF BUSINESS EXPERTS - BUILDING AN AI BOARD TO SPAR WITH
CAN AI IMAGE GENERATORS BE USED TO QUICKLY CREATE RENDERS FROM ISOMETRIC ARCHITECTURAL DRAWINGS TO HELP NON-EXPERTS VISUALISE A SITE?
OPEN SOURCE KUBERNETES-BASED ORCHESTRATION PLATFORM FOR CLUSTER MGMT
MULTI-MODAL, LOCAL, PRIVATE AI ASSISTANT BUILT ON OPEN SOURCE
OPENRAG - BUILD A PERSONAL RETRIEVAL AUGMENTED GENERATION TOOL FOR PDF CHAT
MULTI-DIMENSIONAL MARKET MAPPING TOOL
PAIR A BASIC KNOWLEDGE GRAPH WITH AN OPEN SOURCE LLM

INFRASTRUCTURE / INVESTMENT

HOW CLOSE CAN A PCIE ARCHITECTURE SERVER WITH CONSUMER GRADE GPUS GET TO STATE-OF-THE-ART LLM TRAINING PERFORMANCE ON AN H100 NVLINK NODE?
SOUTH COAST SUPERCOMPUTER - BUILDING A GPU CLUSTER IN SUSSEX, UK
COMBINED HEAT & COMPUTE DEMONSTRATOR - DATA CENTRE + COMMERCIAL HEAT RE-USE
HYPER-RESILIENT, OFF-GRID CONTAINERISED AI COMPUTE DEMONSTRATOR
CARBON NEUTRAL AI COMPUTE CLUSTER PROOF-OF-CONCEPT
GPU HOLDING FUND

EXPLORATORY / CONCEPTUAL RESEARCH

PROBING LIMITS OF PROMPT ENGINEERING WITH LLMS
THE ETYMOLOGICON - VISUAL MAPPING OF SEMANTIC RELATIONSHIPS / WORDNET SYNSETS
EXPLORING HOW TO VISUALISE HIGH-DIMENSIONAL DATA
MINIMUM VIABLE WORLD MODELS FOR LLM GROUNDING / HALLUCINATION REDUCTION
