Data Pitch Project - Datasets Catalogue

Altice Labs - IPTV Activity Logs Personalised Entertainment
DATA THEME: IPTV Activity Logs
DESCRIPTION: These logs cover the large majority of activities performed by IPTV clients, including both TV usage and viewing-pattern information. Each record represents an activity executed on an IPTV device (a set-top box or the MeoGo mobile app) by one or more viewers.
Field names - Type (notes):
id - integer NOT NULL (sequential)
actlog_box_ref - UUID (STB internal identifier (anonymized))
program_id - integer (Programs key)
exhibition_id - integer (Exhibition key)
channel_type - text ("LIVE", "GA" (Automatic Recording) or "VOD")
duration - integer (Milliseconds)
stream_type - text ("FULLSCREEN" or "PIP" (Picture in picture))
station_id_ref - integer (Channels key)
actlog_program_ref - integer (Relates viewing to program)
start_date - timestamp(3) with time zone (Start viewing)
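
As a minimal sketch of how one record might be read, assuming the files are CSV exports whose column names match the field names above (the catalogue only states "Files", so the format, the ISO 8601 timestamp encoding and the sample row are all assumptions), the documented millisecond `duration` can be added to `start_date` to get the end of a viewing:

```python
import csv
import io
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ActivityLog:
    id: int
    actlog_box_ref: str          # anonymized STB identifier (UUID)
    program_id: Optional[int]    # Programs key
    channel_type: str            # "LIVE", "GA" or "VOD"
    stream_type: str
    duration_ms: int             # documented as milliseconds
    start_date: datetime

    @property
    def end_date(self) -> datetime:
        # End of viewing = start of viewing + duration in milliseconds.
        return self.start_date + timedelta(milliseconds=self.duration_ms)


def parse_row(row: dict) -> ActivityLog:
    """Convert one CSV row (all strings) into a typed record."""
    return ActivityLog(
        id=int(row["id"]),
        actlog_box_ref=row["actlog_box_ref"],
        program_id=int(row["program_id"]) if row["program_id"] else None,
        channel_type=row["channel_type"],
        stream_type=row["stream_type"],
        duration_ms=int(row["duration"]),
        start_date=datetime.fromisoformat(row["start_date"]),
    )


# Illustrative sample only; these values are invented, not real data.
SAMPLE = """id,actlog_box_ref,program_id,channel_type,stream_type,duration,start_date
1,00000000-0000-0000-0000-000000000001,42,LIVE,FULLSCREEN,90000,2017-02-21T01:50:00.000+00:00
"""

records = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
print(records[0].end_date.isoformat())
```

If the real export uses a different delimiter or timestamp layout, only `parse_row` would need to change.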

BUSINESS SECTORS: Altice/MEO IPTV service
DATA FORMAT: Files
DATA TYPES: IPTV viewing consumer data
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: Anonymized data derived from personal data
KEYWORDS: IPTV ; viewing records
HISTORY: 3 months of viewing records
STRUCTURE: Highly structured
CHALLENGE PAGE: Altice Labs Challenge Page
Altice Labs - EPG Personalised Entertainment
DATA THEME: EPG
DESCRIPTION: The Electronic Program Guide (EPG) is a data stream comprising all the information about programs and channels, such as scheduled date and time, description and duration. To make the information easier to share, the EPG splits Program and Channel data into separate input files.
Programs > Field names - Type (notes):
id - integer NOT NULL (sequential)
title - text (Program title)
media_type - text ("program" or "series".)
season - integer (Season number)
episode - integer (Episode number)
num_episodes - integer (Number of episodes)
series_id - integer[] (Series key)
is_adult - boolean (Whether it is an adult channel)
participants - text
director - text
genre_id - integer[] (Genre keys)
updated_at - timestamp(3) with time zone (Date of the last update made to the entry)
synopsis - text (Text description of the program)

Channels > Field names - Type (notes):
id - integer NOT NULL (sequential)
channel_name - text (Channel name (HD+SD))
thematic - text (Channel thematic)
is_adult - boolean
interactive - boolean
language - text (Channel language)
channel_mappings > Field names - Type (notes):
id - integer NOT NULL (sequential)
channel_id - integer (Channels key)
station_id - integer (Internal identifier)
call_letter - text (Channel call. Ex. "BBB E")
title - text (Channel complete name. Ex. "BBC Entertainment")
description - text (Channel description)
quality - text ("SD" or "HD")
available_on_channels - text (Platforms availability. Ex. "MEO_IPTV,MEO_PC,MEO_Mobile,ALL_IPTV")
start_date - date (included in DB)
updated_at - timestamp(3) with time zone (Last update)
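
The `available_on_channels` field packs several platform names into one comma-separated string, so a consumer will usually want to split it before filtering. The following is a sketch only: the `ChannelMapping` class, the helper names and the numeric ids are hypothetical, and only the field names and the example platform string come from the list above.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class ChannelMapping:
    channel_id: int            # Channels key
    station_id: int            # internal identifier
    title: str
    quality: str               # "SD" or "HD"
    platforms: FrozenSet[str]  # parsed from available_on_channels


def parse_platforms(available_on_channels: str) -> FrozenSet[str]:
    """Split the comma-separated platform list into a set of names."""
    return frozenset(
        name.strip() for name in available_on_channels.split(",") if name.strip()
    )


def stations_on(
    mappings: Iterable[ChannelMapping], platform: str, quality: str
) -> List[int]:
    """Return station_id values available on a platform at a given quality."""
    return [
        m.station_id
        for m in mappings
        if platform in m.platforms and m.quality == quality
    ]


# Hypothetical rows: the ids are invented for illustration.
mappings = [
    ChannelMapping(1, 101, "BBC Entertainment", "SD",
                   parse_platforms("MEO_IPTV,MEO_PC,MEO_Mobile,ALL_IPTV")),
    ChannelMapping(1, 102, "BBC Entertainment", "HD",
                   parse_platforms("MEO_IPTV,ALL_IPTV")),
]

hd_on_iptv = stations_on(mappings, "MEO_IPTV", "HD")
sd_on_mobile = stations_on(mappings, "MEO_Mobile", "SD")
```

Storing the platforms as a set keeps membership tests simple and tolerates stray whitespace or empty items in the source string.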

BUSINESS SECTORS: Altice/MEO IPTV service
DATA FORMAT: Files
DATA TYPES: IPTV program guide
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: No personal data
KEYWORDS: IPTV ; Electronic Program Guide
HISTORY: 2 years of EPG
STRUCTURE: Highly structured
CHALLENGE PAGE: Altice Labs Challenge Page
Altice Labs - Programs Catalog Personalised Entertainment
DATA THEME: Programs Catalog
DESCRIPTION: Contextual data providing additional insight on aired programs, compiling both external and internal knowledge – complements/validates the EPG with additional information – e.g. program genre, director, cast.
Catalog > Field names (Examples)
Title (Ficheiros Secretos T8 - Ep. 2)
ProgramID (8728039)
MostRecentEPGStartTime (21-02-2017 01:50:00)
CachedThemeCode (MSEPGC_Others)
CachedThemeName (Outros)
Cast (David Duchovny; Gillian Anderson; Robert Patrick;)
Directors (Kim outros;)
EpgSeriesID (42002)
SeasonNumber (8)
SeasonTitle (Ficheiros Secretos)
SeasonTitleWithSeason (Ficheiros Secretos T8)
PresentationID (L2V_RTPM_8728039_210220170150)
Rating (N/CLASS)
StartTime (21-02-2017 01:50:00)
EndTime (28-02-2017 01:50:00)
OriginalChannelName (RTPM)
LastModifiedDate (27-02-2017 23:30:01)
OriginalTitle (Ficheiros Secretos T8 - Ep. 2)
ChannelThematic (Entretenimento)
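
The example values above suggest day-first timestamps and semicolon-terminated name lists. A minimal parsing sketch follows, assuming those two conventions hold for the whole catalog (this is inferred from the single example record, not confirmed by the source); the helper names are hypothetical, and the parsed datetimes are naive because no time zone is stated.

```python
from datetime import datetime
from typing import List

# Day-first layout inferred from "21-02-2017 01:50:00" (assumption).
CATALOG_TS_FORMAT = "%d-%m-%Y %H:%M:%S"


def parse_catalog_time(value: str) -> datetime:
    """Parse a catalog timestamp into a naive datetime."""
    return datetime.strptime(value, CATALOG_TS_FORMAT)


def split_people(value: str) -> List[str]:
    """Split a ';'-separated Cast or Directors value, dropping empty items."""
    return [name.strip() for name in value.split(";") if name.strip()]


start = parse_catalog_time("21-02-2017 01:50:00")
end = parse_catalog_time("28-02-2017 01:50:00")
cast = split_people("David Duchovny; Gillian Anderson; Robert Patrick;")

# Availability window between StartTime and EndTime of the example record.
window = end - start
```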

BUSINESS SECTORS: Altice/MEO IPTV service
DATA FORMAT: Files
DATA TYPES: IPTV programs catalog
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: No personal data
KEYWORDS: IPTV ; Program catalog
HISTORY: Under construction; only available in 2019
STRUCTURE: Highly structured
CHALLENGE PAGE: Altice Labs Challenge Page
Altice Labs - Commercial catalog Personalised Entertainment
DATA THEME: Commercial catalog
DESCRIPTION: Description of the IPTV commercial offer, detailing the content of its different services (linear and non-linear TV packages, video on demand, Netflix and value-added services), including associated costs and available bundling alternatives. Still under construction.
BUSINESS SECTORS: Altice/MEO IPTV service
DATA FORMAT: tbd
DATA TYPES: Consumer data
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: Anonymized data derived from personal data
KEYWORDS: IPTV ; Commercial catalog
HISTORY: Under construction; only available in 2019
STRUCTURE: Highly structured
CHALLENGE PAGE: Altice Labs Challenge Page
Global Financial Services Provider - Company Filings Documents Text Mining and Analytics
DATA THEME: Company Filings Documents
DESCRIPTION: A set of company financial disclosures.
BUSINESS SECTORS: Cross section of stock exchange listed companies.
DATA FORMAT: PDF
DATA TYPES: Company financial disclosures
COLLECTION TERRITORIES: Global
PERSONAL DATA: No personal data
KEYWORDS: company filings ; company disclosures
STRUCTURE: Unstructured
CHALLENGE PAGE: Global Financial Services Provider Challenge Page
Greiner (Greiner Packaging) - Polymer Production Data Smart Manufacturing
DATA THEME: Polymer Production Data
DESCRIPTION:
Source "Machine data": Process data (sensor values from injection moulding machines), Alarm log (logs of alarms and error messages from the machines), Change log (logs of parameter changes at the machines) and "production data" BDE (counters for the machines).
Source SAP Export: Qualifications (Production orders, time stamps for start/end, stops, reason for stops and scrap).
Source SAP/BW: Order (Summary statistics on orders: lot size, duration, productivity, …).
Source Environment: Sensors (temperature, humidity, dew point for various locations on the shop floor).
Source Facility: Facility Monitoring (energy consumption, water consumption, cooling water, compressed air, …).
Source ADAM Modules: Production Data Acquisition (PDA) (for each machine: on/off information and machine cycles (tacts)).

BUSINESS SECTORS: Manufacturing of polymer parts
DATA FORMAT: CSV Files
DATA TYPES: Sensor data from machines, environment ; Production orders ; Quality data
COLLECTION TERRITORIES: Europe
PERSONAL DATA: No personal data
KEYWORDS: polymer production ; injection moulding ; manufacturing ; industry 4.0
HISTORY: Q3/2017 to Q2/2018
STRUCTURE: Partially structured
CHALLENGE PAGE: Greiner Challenge Page
Grow - Soil Sensor data Sustainable Food Supply Chain
DATA THEME: Soil Sensor data - CC-BY-SA licence
DESCRIPTION: The GROW datasets provide soil and plant data. They are generated from validated analyses by users (GROW participants), soil sensors deployed across 9 hotspots throughout Europe, and processed satellite data and imagery. Sensor data is updated every 24 hours and provides a historical collection of soil measurements recorded by the sensors at 15-minute intervals.
BUSINESS SECTORS: Agrifood
DATA FORMAT: API - JSON
DATA TYPES: Geo-location, Timestamp of recording, Soil moisture, Light, Air Temperature, Fertilizer Level, Water Level
COLLECTION TERRITORIES: Austria, Hungary, Greece, Scotland, Spain, Ireland, Portugal, Netherlands, Luxembourg.
PERSONAL DATA: No personal data
CHALLENGE PAGE: Grow Challenge Page
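
Because the soil sensors record at 15-minute intervals, a common first step is to aggregate readings per day. The sketch below assumes a JSON array of readings with `timestamp` and `soil_moisture` keys; that payload shape, the key names and the values are hypothetical, since the catalogue does not document the API response.

```python
import json
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, List

# Hypothetical payload: key names and values are invented for illustration.
PAYLOAD = json.dumps([
    {"timestamp": "2018-06-01T00:00:00", "soil_moisture": 20.0},
    {"timestamp": "2018-06-01T00:15:00", "soil_moisture": 22.0},
    {"timestamp": "2018-06-02T00:00:00", "soil_moisture": 30.0},
])


def daily_mean(readings: List[dict], field: str) -> Dict[str, float]:
    """Average one measurement per calendar day of the reading timestamp."""
    by_day = defaultdict(list)
    for reading in readings:
        day = datetime.fromisoformat(reading["timestamp"]).date()
        by_day[day].append(reading[field])
    return {day.isoformat(): mean(values) for day, values in by_day.items()}


daily = daily_mean(json.loads(PAYLOAD), "soil_moisture")
```

The same function could be pointed at another measurement by changing the `field` argument, provided the real payload exposes it under a known key.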
Grow - Land and Soil Survey data Agrifood
DATA THEME: Land and Soil Survey data - CC-BY-SA licence
DESCRIPTION: The GROW datasets provide soil and plant data. They are generated from validated analyses by users (GROW participants), soil sensors deployed across 9 hotspots throughout Europe, and processed satellite data and imagery. The Edible Plant database and the Soil and Land Survey data are collections of data generated from the analysis of soil samples and the classification of landscape characteristics, carried out by GROW participants at various locations, usually associated with sensor locations. The dataset also provides static content on plants and regenerative practices.
BUSINESS SECTORS: Agrifood
DATA FORMAT: API - JSON
DATA TYPES: Plant information (name, description, image, preferred pH, optimal temperature, etc.), Plant geo-location, Plant regenerative practices, Soil regenerative practices, Ecosystem regenerative practices, Soil and Land Survey data (Geo-location, Soil texture, Stone content, Sediment layer image, Parcel size, Slope type, Slope position, Slope aspect, Land context and use, Land cover, Canopy cover, N-E-S-W and sensor/ground images, Timestamped land activity)
COLLECTION TERRITORIES: Austria, Hungary, Greece, Scotland, Spain, Ireland, Portugal, Netherlands, Luxembourg.
PERSONAL DATA: No personal data
CHALLENGE PAGE: Grow Challenge Page
Grow - Sentinel-1 Mosaic Europe Agrifood
DATA THEME: Sentinel-1 Mosaic Europe
DESCRIPTION: The GROW datasets provide soil and plant data. They are generated from validated analyses by users (GROW participants), soil sensors deployed across 9 hotspots throughout Europe, and processed satellite data and imagery. The Sentinel-1 Mosaic data comprises high-resolution radar-backscatter satellite imagery covering most of Europe.
BUSINESS SECTORS: Agrifood
DATA FORMAT: File - geotiff or netcdf4
DATA TYPES: Surface backscatter imagery at 10m sampling, describing surface and vegetation status
COLLECTION TERRITORIES: Europe
PERSONAL DATA: No personal data
CHALLENGE PAGE: Grow Challenge Page
José de Mello Saúde metadata Healthcare
DATA THEME: José de Mello Saúde's metadata
DESCRIPTION: Clients' demographic and geographic data, clinical activity data and medical information data for surgery and inpatient care episodes that took place between 2014 and 2017 at a group of José de Mello Saúde's private hospitals.
BUSINESS SECTORS: Health
DATA FORMAT: File - CSV
DATA TYPES: Clients' demographic and geographic data, clinical activity data and medical information data
COLLECTION TERRITORIES: Portugal
KEYWORDS: health ; customer profiling ; patient journey ; clinical pathways
HISTORY: 2014 to 2017
PERSONAL DATA: Anonymized data derived from personal data
CHALLENGE PAGE: José de Mello Saúde Challenge Page
Konica Minolta - Customer needs anticipation Customer Needs Prediction
DATA THEME: Customer needs anticipation
DESCRIPTION: Answering the needs of customers is the ultimate challenge of every company. In the B2B sector, meeting these needs with tailored solutions that match a client's specific expectations can be a real challenge: needs, expectations and outcomes can vary drastically between the decision maker, the buyer and the user. Given how much large enterprises now know about their clients, it is conceivable to build a system that anticipates and creates tailored solutions to any client's challenge. The difficulty lies in aggregating the data, which comes from various applications and tools across the information systems, and in creating algorithms that make it easier for a company to anticipate its clients' needs and expectations. The question is: given a client's typology, how can a product or service provider anticipate the client's needs and offer solutions that answer the client's present and future challenges?
BUSINESS SECTORS: Commerce, Distribution, E-commerce ; Connected Objects ; Telecommunications, Technology and IT
DATA FORMAT: Files
DATA TYPES: Sale volumes, Marketing and commercial data, Shopping carts
COLLECTION TERRITORIES: France
PERSONAL DATA: Anonymized data derived from personal data
KEYWORDS: Customer insights ; Sales anticipation ; Needs anticipation ; Sales tools ; Customer intelligence
HISTORY: 2014
STRUCTURE: Not structured (resources were never applied to structure it)
CHALLENGE PAGE: Konica Minolta Challenge Page
MASAI - Taxi and transfers Multimodal Transport
DATA THEME: Taxi and transfers
DESCRIPTION: Local hop-on/hop-off, transfers and tours provider.
BUSINESS SECTORS: Transfers & touristic/leisure activities.
DATA FORMAT: Files - JSON
DATA TYPES: Consumer data
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: No personal data
KEYWORDS: Transfers ; Tourism ; Activities
STRUCTURE: Highly structured
CHALLENGE PAGE: MASAI Mobility Challenge Page
MASAI - Long Distance Bus Multimodal Transport
DATA THEME: Long Distance Bus
DESCRIPTION: Long-distance bus concierge/aggregator which wishes to see more standardised interfaces among the different service providers it aggregates.
BUSINESS SECTORS: Buses
DATA FORMAT: Files - JSON
DATA TYPES: Consumer data
COLLECTION TERRITORIES: France, Italy, Croatia, Serbia
PERSONAL DATA: No personal data
KEYWORDS: Long-distance ; Buses ; Aggregator ; Concierge
STRUCTURE: Highly structured
CHALLENGE PAGE: MASAI Mobility Challenge Page
MASAI - Local Buses Multimodal Transport
DATA THEME: Local Buses
DESCRIPTION: Local transfer provider.
BUSINESS SECTORS: Transfers & touristic/leisure activities
DATA FORMAT: Files - JSON
DATA TYPES: Consumer data
COLLECTION TERRITORIES: Portugal
PERSONAL DATA: No personal data
KEYWORDS: Transfer ; Tourism
STRUCTURE: Highly structured
CHALLENGE PAGE: MASAI Mobility Challenge Page
MASAI - Car sharing Multimodal Transport
DATA THEME: Car sharing
DESCRIPTION: Car-sharing service provider.
BUSINESS SECTORS: Car-sharing
DATA FORMAT: Files - JSON
DATA TYPES: Consumer data
COLLECTION TERRITORIES: France (Nice)
PERSONAL DATA: No personal data
KEYWORDS: Car-sharing
STRUCTURE: Highly structured
CHALLENGE PAGE: MASAI Mobility Challenge Page
Met Office - Pollen Forecast Data for the United Kingdom Weather and Climate Change
DATA THEME: Pollen Forecast Data for the United Kingdom
DESCRIPTION: 5-day pollen forecast for all regions of the UK. 16 regions are available, covering the whole of the UK. The forecast covers all pollen types: grass, tree, nettle and fungal spore.
DATA FORMAT: CSV Files
COLLECTION TERRITORIES: UK
PERSONAL DATA: No personal data
KEYWORDS: Pollen data ; Pollen forecast
CHALLENGE PAGE: Met Office Challenge Page