Feedback on Proposed Changes for Data Delivery
We would welcome your reactions and comments on the feasibility of introducing the following series of proposed changes by next Monday (10th October), in advance of possible further discussions at Tuesday's DUG meeting.
Background
As a result of discussions at the last Data Users Group and subsequently with individual bureaux, it is proposed that the format of the support files and check data for RAJAR's RLD should be revised. The aim is to make it easier for users to load and verify this data in advance of receiving final listening records. For reference, we also operated an in-house system for June's release that mirrored the set-up procedure of some bureaux.
In addition, the RAJAR Board would like to reduce the lead time between the release of RLD listening records at 5.30pm on the evening of release day and the distribution of data to buyers and planners approximately seven working days later. Discussions with Mediatel and Telmar suggest that a typical timetable for the release comprises:
- 1-2 days loading and verifying data
- 2-3 days resolving data queries and reconciling results
- 2 days to copy and distribute final results to users
- These timings need to be amended to reflect the agreed industry Trading Day, rather than 7 days. Trading Day is currently the second Monday after data delivery - this is actually 18 days. Plus we have also seen data issues right up until a few days before Trading Day over previous quarters, and they are still ongoing for Q2 2005!! If we are expected to action these, it would seem unwise to move the Trading Day.--Emilyp 12:14, 6 Oct 2005 (BST)
- Following distribution, Client companies currently have a further week to load data internally - which can take them a little while. Again, needs confirmation but in our experience some companies have been known to miss the trading day to complete this work.--Emilyp 12:15, 6 Oct 2005 (BST)
The provision of more data in an easier-to-use form, with a prompt response to any queries, should shorten some of this timetable. The next step may then be the distribution of respondent data, excluding listening records, in advance so that only listening data is appended on release day. It is hoped that a combination of these changes could shorten the release timetable by up to a week for December's results.
As an initial step, the following note gives proposed changes to the file formats that may be introduced from September's release. Alongside these possible changes, we will also be introducing additional data checks specifically to replicate Mediatel's system, providing further check data, and adopting a new procedure for responding to queries.
- The quality of the data is paramount to us, and we utilise a lot of time before Trading Day to implement as many checks as possible to this end. If RSL could define what checks they are implementing, as mentioned, this would be appreciated.--Emilyp 12:44, 6 Oct 2005 (BST)
A reduction to the timetable is likely to depend on the universal introduction of these changes. Comments should be sent to both Tony Hughes, one of our senior analysts, and myself, who will be managing the production of September's RLD and the introduction of the changes. Email contact details are:
- John Stockley: john.stockley@ipsos.com
- Tony Hughes: tony.hughes@dsl.pipex.com
- It seems prudent to have a final cut off date at which no more changes can be made. For example, the respondent level data without listening events will be loaded and checked. The ensuing file complete with data will be assumed to be cleaned and correct - once loaded no further changes will be allowed as this will make meeting new deadlines impossible.--Emilyp 12:16, 6 Oct 2005 (BST)
- How much sooner will data be distributed - including segment files and check figures? A schedule of all files, including the respondent data without listening events, and proposed timings would be good.--Emilyp 12:20, 6 Oct 2005 (BST)
- We would be happy to make some improvements here and begin to implement the new file formats, but realistically require at least two test runs before considering any movement of the Trading Day. There are no guarantees of time savings so it would be ideal if all systems could begin to use the new files and report back on time savings or problems. This should be repeated a second time - using the new files alone, by all parties before they are adopted by the industry. At this point also, systems can feed back if any changes to Trading Day can be made.--Emilyp 12:20, 6 Oct 2005 (BST)
- Perhaps consideration could be given to releasing data on a Monday rather than Wednesday, as the weekend days could interfere with processing. It is much more likely to speed up the process if data is released first thing in the week.--Emilyp 12:23, 6 Oct 2005 (BST)
The main changes to file formats can be summarised as follows:
All new files would be in tab delimited text format
Feedback has indicated that the current mix of formats, where some files are in csv format, some in text, some in Word documents, some in Excel spreadsheets etc., is making the loading and verifying process more complex and time-consuming. Therefore it is proposed that all files should be in tab delimited text and all extraneous formatting removed.
Originally we had considered csv format as the standard. However, some users pointed out that occasionally station names included commas in the text. In these instances, using the csv format would cause errors in the support files.
Where users have problems with the tab delimited format, and prefer csv format, most modern text editors will allow them to make a global change to convert tabs to commas.
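As a minimal sketch of such a conversion, the Python below rewrites tab delimited text as csv using the standard csv module rather than a plain search and replace, so that any field containing a comma (for example a station name) is quoted instead of being split into two columns. The function name and the sample station name are illustrative assumptions, not part of the proposed files.

```python
import csv
import io


def tsv_to_csv(tsv_text: str) -> str:
    """Convert tab delimited text to csv, quoting fields that contain commas.

    Hypothetical helper for illustration only.
    """
    # Read the input as plain tab separated fields; quote characters are
    # treated as ordinary text because the source files are not quoted.
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    out = io.StringIO()
    # The default writer quotes only the fields that need it,
    # i.e. those containing a comma, a quote character or a line break.
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up row: a 4-digit code and a station name containing a comma.
    print(tsv_to_csv("1234\tNorth, South FM\n"), end="")
```

A straight replacement of tabs with commas in a text editor would reintroduce the problem described above whenever a station name contains a comma, which is why the sketch relies on csv quoting.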
New Data Link
There have been difficulties matching data from different support files using the text of the station name. If the text varied by as much as one character, automatic matching was impossible, and so manual intervention was required.
Where relevant, all new files would be indexed by a unique 4-digit code, which should make matching data between files seamless.
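To illustrate how a shared code could replace matching on station name text, the following sketch joins two tab delimited files on a common code column and lists any codes present in only one of them. The column names ("code", "name", "reach") and the sample rows are assumptions made for the example; the actual layouts are those given in the dictionary supplied with the test files.

```python
import csv
import io


def index_by_code(tsv_text, key="code"):
    """Return {code: row} for a tab delimited file with a header row."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[key]: row for row in rows}


def join_on_code(left_text, right_text, key="code"):
    """Merge rows sharing a code; also report codes found in only one file."""
    left = index_by_code(left_text, key)
    right = index_by_code(right_text, key)
    matched = {
        code: {**left[code], **right[code]} for code in left if code in right
    }
    unmatched = sorted(set(left) ^ set(right))
    return matched, unmatched


if __name__ == "__main__":
    # Made-up sample data; codes are kept as text so leading zeros survive.
    stations = "code\tname\n0101\tStation A\n0102\tStation B\n"
    checks = "code\treach\n0101\t250\n0103\t90\n"
    matched, unmatched = join_on_code(stations, checks)
    print(matched)
    print(unmatched)
```

Reading the codes as text rather than numbers is a deliberate choice here: a 4-digit code such as 0101 would otherwise lose its leading zero and no longer match across files.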
Calculation of Station TSAs
There would be a new support file called "tsa.txt". This would contain the TSA definition for every published station for each of the four quarters in the current release.
This would make it quicker and easier for users to match the published check data.
Test Files
A draft set of support files in the proposed format and based on June's release is attached to this email. A dictionary is included as a guide to the contents.
John Stockley
4th October 2005
- Could you please confirm when the files will be used from, and if you envisage sending only new formats or parallel run with old formats for a period?--Emilyp 12:25, 6 Oct 2005 (BST)
- Is it possible to merge the check figures files into one with flags (extra columns) to indicate weighted and unweighted data and whether it is yearly, half yearly or quarterly?--Emilyp 12:25, 6 Oct 2005 (BST)
- There was an idea to include a file that has both RSL codes and segment definitions within it. Will this be possible?--Emilyp 12:26, 6 Oct 2005 (BST)
- Some definition of the PC Based Planning Groups is required. There seems to be a proliferation of these groups, now including 'Out of Analogue Areas' etc., all of which appear to be traded. I'm not exactly sure of the purpose of this file or how it should be treated. Furthermore, loading this ever-increasing list will impact on timescales - particularly as it is probably a manual process. It would be good if all traded groups/stations, plus macros, could be included within the Rep List, with those stations/groups which appear on the Quarterly Summary clearly flagged.--Emilyp 12:26, 6 Oct 2005 (BST)
- Will there be a 'Release Changes' document detailing, for example, any changes to your 4-digit report numbers?--Emilyp 12:26, 6 Oct 2005 (BST)
