
Technical

This section describes the technical details of the Discovery Data Service: the architecture that underpins the data service software, the components that make up the service, the testing and assurance processes, and the resources, technologies, and software that are used to host, develop, and support the service.

For details see:

Architecture

The following diagram explains the underlying architecture and the basic publication to subscription pathways that data must follow.

Publisher data is identified by organisation and software format/version. For example, EMIS CSV 5.6 or Adastra XML 1.0.

Note: It is assumed that published data will be accompanied by an ODS (Organisation Data Service, NHS) code, or similar.

Important: Existing data processing and sharing agreements are checked to validate that the DDS has permission to process data from that organisation and in that format/version, and then share that data with specified subscribers.
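The agreement check described above can be pictured as a lookup keyed on publisher, software format, and version. The following is a minimal sketch only: the `Agreement` record, the ODS codes, and the subscriber names are hypothetical and do not reflect how the DDS actually stores or evaluates its agreements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Agreement:
    """Hypothetical record of one processing and sharing agreement."""
    ods_code: str           # publisher's ODS code
    software: str           # e.g. "EMIS CSV"
    version: str            # e.g. "5.6"
    subscribers: frozenset  # subscribers the data may be shared with


def permitted_subscribers(agreements, ods_code, software, version):
    """Return the subscribers allowed to receive this publisher feed.

    An empty set means no agreement covers the feed, so it must not be processed.
    """
    allowed = set()
    for agreement in agreements:
        key = (agreement.ods_code, agreement.software, agreement.version)
        if key == (ods_code, software, version):
            allowed |= agreement.subscribers
    return allowed


# Illustrative data only: the ODS codes and subscriber names are made up.
AGREEMENTS = [
    Agreement("X00001", "EMIS CSV", "5.6", frozenset({"subscriber-a", "subscriber-b"})),
    Agreement("X00002", "Adastra XML", "1.0", frozenset({"subscriber-a"})),
]
```

In this sketch a feed in an unexpected version (for example EMIS CSV 5.7 from the same organisation) returns no subscribers, which mirrors the rule that permission is tied to both the organisation and the format/version.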

Secure publication to the DDS

Data is transferred, via the HSCN (Health and Social Care Network), into the DDS HSCN-connected AWS (Amazon Web Services) instance in one of the following four ways:

SFTP Pull

This uses GPSoC IM1 pairing directly with the publisher’s system supplier. The data is pulled by the DDS from the supplier’s SFTP (Secure File Transfer Protocol) server and then onto a publisher-specific SFTP folder structure in Discovery’s AWS instance over a TLS 1.2 encrypted link. Data from EMIS is PGP encrypted at source.

SFTP Push

The data is pushed from the publisher's SFTP server into the DDS and then onto a publisher-specific SFTP folder structure in Discovery’s AWS instance over a TLS 1.2 encrypted link.

HTTPS Post

The data is posted to the DDS and then onto a publisher-specific SFTP folder structure in Discovery’s AWS instance over a TLS 1.2 encrypted link.
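As an illustration of what a TLS 1.2 floor looks like on the sending side, the sketch below prepares an HTTPS POST with a client context that refuses older protocol versions. It uses only the Python standard library. The endpoint URL and payload are placeholders, because the real DDS posting address and message format are not given here.

```python
import ssl
import urllib.request


def tls12_context():
    """Build a client TLS context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()            # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2  # reject TLS 1.1 and below
    return ctx


def build_post(url, payload: bytes):
    """Prepare (but do not send) an HTTPS POST carrying the published data."""
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={"Content-Type": "application/octet-stream"},
    )


# Hypothetical endpoint: the real DDS posting URL is not stated in this section.
request = build_post("https://dds.example.invalid/publish", b"example payload")
context = tls12_context()
# To send: urllib.request.urlopen(request, context=context)
```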

MLLP Send

Messages are posted over VPN TLS 1.2 encrypted links into the DDS HL7 receiver and then onto Discovery’s AWS instance.
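MLLP (Minimal Lower Layer Protocol) carries each HL7 v2 message inside a simple byte frame: a start byte (0x0B), the message, then an end sequence (0x1C 0x0D). The sketch below shows that framing only. The message content is made up, and the helper names are illustrative rather than part of the DDS HL7 receiver.

```python
START_BLOCK = b"\x0b"    # <VT>, marks the start of an MLLP frame
END_BLOCK = b"\x1c\x0d"  # <FS><CR>, marks the end of an MLLP frame


def mllp_wrap(hl7_message: bytes) -> bytes:
    """Wrap one HL7 v2 message in an MLLP frame ready to send."""
    return START_BLOCK + hl7_message + END_BLOCK


def mllp_unwrap(frame: bytes) -> bytes:
    """Strip the MLLP framing and return the HL7 v2 message."""
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(START_BLOCK):-len(END_BLOCK)]


# Made-up, minimal HL7 v2 message header used only for illustration.
message = b"MSH|^~\\&|SENDER|FACILITY|DDS|DDS|20200101000000||ADT^A01|MSG0001|P|2.3\r"
frame = mllp_wrap(message)
```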

AWS Security

As a hosting organisation, AWS has in place over 50 security and compliance certifications including ISO 27001, 27017, 27018; PCI DSS; SOC 1/2/3; and Cyber Essentials Plus.

The DDS is built on this platform and uses several enhanced features to secure the data: all networks are encrypted to at least TLS 1.2 with high-strength ciphers; all data at rest (such as in databases) and in transit (such as in processing queues) is encrypted; administrative access is controlled using a VPN with 2FA and access controls; and all access to the systems is logged and audited.

The system has been designed to exceed the requirements set out in the NHS Data Security and Protection Toolkit.

HSCN Certification

The Discovery platform has a certified HSCN to AWS connection. The connection is provided through our IG agreement with Tower Hamlets CCG (ODS code 08V).

Components

The following diagram illustrates the various logical components that make up the Discovery Data Service:

APIs

The Discovery test APIs help our industry colleagues to develop new health and social care products and applications by accessing test patient data, from multiple health and social care sources, stored in the Discovery Data Service (DDS).

We believe that this will also increase the benefits realised to those publishing to and subscribing from the DDS.

If you want to test our APIs against your own software, register your interest.

API: Get structured record
Details: Returns a complete patient record in a structured format following GP and Care Connect standards.
Notes: All data requests for information hosted with DDS are linked to a project data set. The project data set is linked to a data sharing agreement from the publishing organisations. A project data set also governs and limits the type and nature of the data required by the subscribing system. For example, a diabetes system would request only diabetes data from DDS and therefore the data would be linked and filtered by a pre-authored diabetes data set, containing record type definitions such as Conditions, Medication, Allergies, and Diabetes related Observations, and value set definitions such as the national diabetes code set.
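The way a project data set limits what a subscribing system receives can be sketched as a two-stage filter: first by record type, then by value set where one is defined. Everything in this example is an assumption made for illustration, including the `ProjectDataSet` structure, the record layout, and the codes, which are placeholders rather than entries from the national diabetes code set.

```python
from dataclasses import dataclass, field


@dataclass
class ProjectDataSet:
    """Hypothetical definition of what a subscribing system may receive."""
    name: str
    record_types: set = field(default_factory=set)  # e.g. {"Condition", "Medication"}
    value_sets: dict = field(default_factory=dict)  # record type -> allowed codes


def filter_record(entries, data_set):
    """Keep only entries whose type, and code where restricted, are in the data set."""
    kept = []
    for entry in entries:
        if entry["type"] not in data_set.record_types:
            continue  # record type is outside the data set
        allowed_codes = data_set.value_sets.get(entry["type"])
        if allowed_codes is not None and entry["code"] not in allowed_codes:
            continue  # record type is allowed, but this code is not in its value set
        kept.append(entry)
    return kept


# Illustrative only: these codes are placeholders, not real clinical codes.
diabetes = ProjectDataSet(
    name="diabetes",
    record_types={"Condition", "Medication", "Allergy", "Observation"},
    value_sets={"Observation": {"OBS-HBA1C"}, "Condition": {"COND-T2DM"}},
)
record = [
    {"type": "Condition", "code": "COND-T2DM"},
    {"type": "Condition", "code": "COND-ASTHMA"},
    {"type": "Observation", "code": "OBS-HBA1C"},
    {"type": "Medication", "code": "MED-METFORMIN"},
    {"type": "Procedure", "code": "PROC-APPENDICECTOMY"},
]
filtered = filter_record(record, diabetes)
```

Here the asthma condition is dropped by the value set, the procedure is dropped by the record type list, and the medication passes because no value set restricts that record type in this made-up definition.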

Access the following further test Discovery APIs at:

Test - https://devgateway.discoverydataservice.net/eds-api/

Important: You must have a valid Discovery username and password to obtain a valid authentication token. You must have a valid authentication token to call the secure APIs.
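The sketch below shows one way a client might attach an authentication token to a call against the test gateway. It is an assumption-heavy illustration: the text does not say how the token is obtained or presented, so the bearer scheme, the `secure_request` helper, the path, and the token value are all hypothetical. Only the base URL comes from this page.

```python
import urllib.request

BASE_URL = "https://devgateway.discoverydataservice.net/eds-api/"  # test gateway from the text


def secure_request(path, token):
    """Prepare (but do not send) a request to a secure API with a token attached.

    Assumption: the token is presented as a bearer token; the text does not say.
    """
    if not token:
        raise ValueError("an authentication token is required for the secure APIs")
    return urllib.request.Request(
        BASE_URL + path.lstrip("/"),
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )


# "example/resource" and the token value are placeholders, not real DDS values.
req = secure_request("example/resource", "example-token")
```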

API: Get flag for patient
Details: Returns the current state of the given flag for the given patient, for example frailty.

API: UPRN lookup
Details: Returns a unique property reference number (UPRN), or an unmatched response, for a patient address.

Important: The APIs and algorithms will only use data, return resources, or identify patients that you have protocols, agreements, and access for.

Management tools

The Discovery Data Service hosts a number of web applications and web application APIs for managing the data service.

We have a skilled and experienced team developing tools and utilities to maximise the benefits of a single source of data.

The tools, which are in various stages of development and currently being prioritised in line with the use cases being presented to the East London Discovery Project, include:

Query & Reporting Tools

The Discovery Data Service hosts a number of web applications and web application APIs for record viewing and analytics/reporting purposes.

We have a skilled and experienced team developing tools and utilities to maximise the benefits of a single source of data.

The tools, which are in various stages of development and currently being prioritised in line with the use cases being presented to the East London Discovery Project, include:

Published data

Care settings: Primary Care GP, Secondary Care, Unscheduled Care, Community & Mental Health

EMIS Web (GP practices)

  • Barking & Dagenham: 31
  • Brent: 5 51*
  • City & Hackney: 40
  • Ealing: 1 1*
  • Harrow: 0 33*
  • Havering: 41
  • Hillingdon: 1 45*
  • Hounslow: 0 1*
  • Hurley Group: 11
  • Newham: 50
  • Redbridge: 34
  • Tower Hamlets: 35
  • Waltham Forest: 35

SystmOne (GP practices)

  • Barking & Dagenham: 1
  • Central London: 28 6*
  • Ealing: 36 41*
  • Hammersmith & Fulham: 21 8*
  • Hounslow: 24 22*
  • Redbridge: 6
  • Waltham Forest: 5
  • West London: 25 17*

Vision (GP practices)

  • Barking & Dagenham: 2
  • Havering: 2
  • Redbridge: 1

Cerner Millennium

  • Homerton Hospital Trust**
  • St Barts Hospital Trust***

EMIS Web (Barts Health services)

  • BH¹ Community Heart Failure Nursing
  • BH¹ Community Cardiovascular Nursing
  • BH¹ Community Dietitians
  • BH¹ Community Neuro Team
  • BH¹ Diabetes Service
  • BH¹ IMAPS + MSK Physiotherapy
  • BH¹ Renal Service
  • BH¹ Specialist Children's Services
  • BH¹ Community Diabetes Service

Medway

  • BHRUT*

Adastra

  • The Hurley Group
  • CHUHSE
  • PELC*
  • Newham OOH*
  • London Ambulance Service (NEL)*

EMIS Web (other providers)

  • Patient First
  • City & Hackney GP Confederation

Rio

  • East London Foundation Trust*
  • North East London Foundation Trust*

Numbers indicate the number of GP practices that are sending live data.

Numbers and text marked with * indicate data that is not yet live.

** ADT feed only.

*** ADT feed plus daily file feed that includes CDE (Commissioning Data Extracts) from Power Insight (Millennium data warehouse software) and CDS (Commissioning Data Sets) files.

¹ Barts Health Trust

Current data sets

The following table shows the datasets that are currently published by different supplier systems:

Supplier systems and the data set types each one publishes:

  • EMIS Web: Primary care GP, Primary care community
  • SystmOne: Primary care GP, Primary care community
  • Vision: Primary care GP
  • Adastra: Out of hours
  • Cerner Millennium: A&E, Inpatient, Outpatient, ADT feed
  • Rio: Community

Data sets:

  • Allergies
  • Appointments
  • Care episodes
  • Diagnoses
  • Documents
  • Encounters/consultations
  • Family history
  • Follow ups/diary recall
  • Free text
  • Immunisations
  • Inpatient activity
  • Medication
  • Non-scheduled activity
  • Observations¹
  • Outpatient activity
  • Patient demographics
  • Problems
  • Procedures
  • Questions & answers
  • Referrals
  • Templates
  • Test orders
  • Test results/Pathology results

¹ Includes signs, symptoms, biological values, and pathology.

Key to data set status:

  • Data sets available and published to the DDS
  • Data sets available but not published to the DDS
  • Not applicable
  • Data sets coming soon

FHIR profile resources

The following list shows the FHIR profiles that we currently support:

See also:
Get FHIR profile resource type
GitHub FHIR Profiles repository

Data assurance & testing

The Discovery Data Service receives data from publisher organisations that could have:

  • Multiple IT systems that are responsible for capturing the data.
  • Multiple mechanisms of publishing the data to the DDS.
  • Several message or file formats.
  • Multiple content taxonomies or code schemes, some standard and some local, and some with no code schemes at all.

This many-to-many relationship between the technical data exchange formats and an individual patient record creates a significant challenge when we try to aggregate and link the data for direct care and secondary uses.

The approach that we have taken consists of testing each and every system, extract mechanism, format, taxonomy and code scheme independently against clinical or operational scenarios to make sure that the system is fit for purpose.
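Testing each item independently, rather than every combination at once, can be sketched as pairing one item at a time with each scenario. The dimension names, items, and scenarios below are placeholders chosen for illustration; the real test packs are derived per publisher and are not listed here.

```python
def independent_test_cases(dimensions, scenarios):
    """Pair every item in every dimension with every scenario, one item at a time."""
    return [
        (dimension, item, scenario)
        for dimension, items in dimensions.items()
        for item in items
        for scenario in scenarios
    ]


# Placeholder values: not the actual systems, formats, or scenarios under test.
dimensions = {
    "system": ["System A", "System B"],
    "extract mechanism": ["SFTP Pull", "HTTPS Post"],
    "format": ["CSV", "HL7 v2"],
    "code scheme": ["standard codes", "local codes"],
}
scenarios = ["record a diagnosis", "register a patient"]

cases = independent_test_cases(dimensions, scenarios)
```

With these placeholder values the sketch yields 16 test cases (8 items across 4 dimensions, each against 2 scenarios), whereas testing every combination of the same items against both scenarios would need 32.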

It should be noted that when data is moved from one system to another it always loses some data or some context; this is referred to as degrade. The objective of the assurance process is to prove that the mechanism involved in transferring the data is fit for purpose and that the data content is good enough for the subscriber use cases.

The starting point of the overall assurance process is a publisher data entry scenario, and the end point is a subscriber data usage scenario; test scenarios and test packs are derived from the overall specification, modified by the system extract capability, and narrowed to the scenario of interest. There is no requirement to test the entire data service at any point.

Integral to the Discovery Data Service is a comprehensive set of tools for monitoring all access and configuration changes on our cloud platform. We log everything a user does, from signing in and out to any change to a server or configuration item, to create a complete audit trail. Access rules are continually reviewed, with only minimal permissions applied unless a change is scheduled to be made. All access is governed by the Clinical Effectiveness Group (CEG) Barts & the London Queen Mary University Information Security Policy.

Notifications are generated as data is received at any of our endpoints. This data is then processed into the service, with audit logs generated along the way to ensure accuracy and consistency.

Our data testing and assurance team has developed an advanced audit log, which is implemented in every transform. This means that we can view every transform result to make sure that all data published into the DDS is validated at every step in the process, and it allows technical and clinical teams to validate and sign off the accuracy of the data feed.
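One common way to make every transform write to an audit log is to wrap each transform in a decorator that records its input and outcome. The sketch below illustrates that general pattern only; it is not the DDS audit log. The `audited` decorator, the in-memory `AUDIT_LOG` list, and the example transform are all hypothetical.

```python
import functools

AUDIT_LOG = []  # stand-in for a persistent audit store


def audited(transform):
    """Record the input, outcome and result of every call to a transform."""
    @functools.wraps(transform)
    def wrapper(record):
        entry = {"transform": transform.__name__, "input": record}
        try:
            result = transform(record)
        except Exception as exc:
            # Failed transforms are logged too, then the error is re-raised.
            entry.update(status="error", detail=str(exc))
            AUDIT_LOG.append(entry)
            raise
        entry.update(status="ok", output=result)
        AUDIT_LOG.append(entry)
        return result
    return wrapper


@audited
def normalise_nhs_number(record):
    """Hypothetical transform: strip spaces from an NHS number field."""
    return {**record, "nhs_number": record["nhs_number"].replace(" ", "")}


# The NHS number here is a dummy value used only for illustration.
normalise_nhs_number({"nhs_number": "943 476 5919"})
```

Because both successful and failed calls are recorded, a reviewer can trace each published record through every transform step, which is the property the audit log described above is meant to provide.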

Technologies

The Discovery Data Service uses the following resources, technology and software:

For technical and API queries, please contact info@discoverydataservice.org