Data Quality Assessment Check List - Wockets

Data Quality Assessment Check List
Please go through the list in order to assess the quality of a data set. Please try to do this check no later than one day
after the data collection session (Stanford) or one day after the data has been sent to you to double check (MIT).
Initial and date when each test is completed. In “Special Notes” note any anomalies you encounter.
Data set ID (type ID):
Stanford checker (type name and email):
MIT checker (type name and email):
Please conduct each set of tests. Type your initials in the completed column for each when it is done.
Test
Completed?
Special Notes

If you were present during data collection, think of anything unusual that
happened. Check the notes file and make sure these details are present in as
much detail as possible (e.g., “Wocket ABCD dropped off the ankle roughly
between 11:45 and 12:53 during the treadmill activities.”)
Directory Structure & Files

Check that the data collected from all sensors is located in the appropriate folder
according to the directory structure described in the wiki here:
http://wockets.wikispaces.com/Files+and+naming+conventions .

If RTI sensor data was collected, upload the corresponding files to the MIT
RTI server (IP address: 18.85.47.105).

Back up the existing “ActivityLabelsRealtime.xml” file (e.g., rename it to
“previous_ActivityLabelsRealtime.xml”).

Copy the new “ActivityLabelsRealtime.xml” file (new protocol) from
http://wockets.wikispaces.com/ActivityLabelsRealtime to the wockets folder
(e.g., copy it to the “…/Jan1310/wockets” folder).

Make a copy of the existing “Session/annotation/audioannotation” folder and
rename the copy to “previous_audioannotation_date” (where “date” is the date
when you made the copy).
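The two backup steps above can be scripted. The following is a minimal sketch, not part of the Wockets tools: the function name, its parameters, and the YYYY-MM-DD date stamp are assumptions made for illustration. It copies rather than renames, so the originals stay in place until the new label file is copied over them.

```python
import shutil
from datetime import date
from pathlib import Path


def backup_before_update(session_dir, wockets_dir, today=None):
    """Back up the label file and the audio annotation folder.

    Hypothetical helper. Returns (labels_backup, audio_backup); either
    value is None when the corresponding source does not exist.
    """
    today = today or date.today()
    session_dir, wockets_dir = Path(session_dir), Path(wockets_dir)

    labels = wockets_dir / "ActivityLabelsRealtime.xml"
    labels_backup = None
    if labels.exists():
        labels_backup = labels.with_name("previous_" + labels.name)
        shutil.copy2(labels, labels_backup)

    audio = session_dir / "annotation" / "audioannotation"
    audio_backup = None
    if audio.is_dir():
        # The date format is an assumption; the checklist only says "date".
        stamp = f"{today:%Y-%m-%d}"
        audio_backup = audio.with_name(f"previous_audioannotation_{stamp}")
        # copytree raises FileExistsError if this backup already exists.
        shutil.copytree(audio, audio_backup)

    return labels_backup, audio_backup
```

Because `copytree` refuses to overwrite an existing folder, running the sketch twice on the same day fails loudly instead of silently replacing an earlier backup.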
Sensor Placement

Check that the sensor placement used in the experiment is consistent with
what is recorded in the SensorData.Xml and Notes.doc files. This check has to
be done for both Wockets and MITes sensors, if both were used.
- Correct the files if they do not match.
Annotations (Part 1)

Open the Wockets Annotator Software and load the annotated activities.
Check that there are no inconsistencies in the assignment of labels and
nothing else unusual. Some errors will be highlighted in red. These may require
manual adjustment (e.g., if the old version of the ActivityLabelsRealtime
file had misidentified or misspelled activities whose names are not identical
to those in the current protocol, the software will require that a valid
annotation label be reselected).

Check that the “jumping jacks” markers are included at the beginning and end
of the session and, on newer datasets, that the “Moving arm only” and 4 other
Wocket position checks are included.

Check that the last labels have appropriate start and end times and dates. Be
particularly careful if there has just been a change in daylight saving time.

If any corrections are made, save the session and generate a new XML file.
Annotations (Part 2)

Using the annotator summary window created after clicking the “generate
Xml” button, confirm that the annotation time for both the postures and the
activities is normal given what is expected. For instance, all major activities
that were done for multiple minutes should be present and have approximately
the right number of minutes.

Using the annotator summary window, compare the list of “annotated
postures and activities” and “no annotated postures and activities.” Make sure
that there are no missing or incorrect labels in the list (for example, activities
that were definitely performed but do not appear on the corresponding
list). If there is a problem, it may be necessary to clean up the annotations or
to edit the

If there are any important comments written in the session notes, write them
in the notes file.
Merged Data

Remerge the session data using the Wockets Merger Software.
Quality Assessment Results (Part 1: non-Wockets/MITes devices)

Open the quality assessment report generated by the merger (“results.html”).

Within the quality assessment document, check which devices were merged.
If any devices are missing, indicate them in the notes file. If devices are
legitimately missing because of equipment breakdown, check that the
explanation appears in the notes file. If not, indicate the reason is unknown.

Check the percentage of samples and time collected for each type of sensor
(Zephyr, Actigraph, Oxycon, RTI, Columbia, GPS). Indicate in the notes file
if there is anything suspicious or unusual.

Check the start time and start date for each of the devices and make sure
there is no discrepancy, especially a 1-3 hour shift from what you would
expect to see.
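The start-time comparison can be sketched as follows, assuming the start times have been read off the report by hand. The function name, the input layout, and the five-minute tolerance are illustrative assumptions, not values from the merger software.

```python
from datetime import datetime, timedelta


def flag_start_time_shifts(expected_start, device_starts,
                           tolerance=timedelta(minutes=5)):
    """Return {device: offset} for devices whose start time differs from
    the expected session start by more than `tolerance`.

    Hypothetical helper; the tolerance is an assumed value.
    """
    flagged = {}
    for device, start in device_starts.items():
        offset = start - expected_start
        if abs(offset) > tolerance:
            flagged[device] = offset
    return flagged


expected = datetime(2010, 1, 13, 10, 0)
starts = {
    "Zephyr": datetime(2010, 1, 13, 10, 1),
    "Actigraph": datetime(2010, 1, 13, 12, 0),  # looks like a 2-hour shift
}
shifts = flag_start_time_shifts(expected, starts)
```

An offset close to a whole number of hours, as for the Actigraph entry here, is the pattern the checklist warns about.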
Quality Assessment Results (Part 2: Wockets and MITes data)


If MITes were used, check that the MITes sampling rate for each placement is
more than 45%. If not, indicate which are below 45% in the notes file. This
check has to be done only for the old data collections in which MITes were
included.

Check that the Wockets data loss is NOT more than 5%. If there are Wockets
with more than 5% data loss, indicate their IDs in the notes file.

Go to the Wockets data loss per activity table and confirm that there are no
activities completely missing or that have an unusually small number of seconds
of data. If there are, please indicate this in the notes file. Either explain or
indicate that there is no known explanation.
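The two threshold checks in this section can be expressed as a small sketch. It assumes the percentages are typed in by hand from “results.html”; the function names and input dictionaries are hypothetical, and nothing here parses the report itself.

```python
WOCKET_MAX_LOSS_PCT = 5.0   # from the checklist: loss must not exceed 5%
MITES_MIN_RATE_PCT = 45.0   # from the checklist: rate must be more than 45%


def wockets_over_loss_limit(loss_pct_by_id):
    """Return the sorted IDs of Wockets whose data loss is more than 5%."""
    return sorted(wocket_id for wocket_id, loss in loss_pct_by_id.items()
                  if loss > WOCKET_MAX_LOSS_PCT)


def mites_below_rate(rate_pct_by_placement):
    """Return the sorted placements whose sampling rate is not more than 45%."""
    return sorted(placement for placement, rate in rate_pct_by_placement.items()
                  if not rate > MITES_MIN_RATE_PCT)


# Example values are made up for illustration.
bad_wockets = wockets_over_loss_limit({"ABCD": 2.1, "EFGH": 7.4, "IJKL": 5.0})
bad_mites = mites_below_rate({"hip": 61.0, "wrist": 44.9})
```

A Wocket at exactly 5% loss passes and a MITes placement at exactly 45% is flagged, which follows the wording “NOT more than 5%” and “more than 45%”.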
Data Visualization (Part 1: Overall check for missing data and gaps)

Visualize the data collected during the session using the Wockets Viewer
Software. (Make sure the software uses the full screen!)

Verify that all successfully merged devices appear in the visualization graphs
with data points showing. Completely missing devices should be indicated in
the notes file. (Note: data that is supposed to be present but is not plotted
often indicates a timestamp problem that keeps the data from appearing.)

Check that there are no obvious offsets present in the data (e.g. missing data
for one or two hours). If anomalies are found, indicate them in the notes file.

Verify that the annotation labels are correct by checking the start and end
times of the whole annotation stream. Then check that the annotation
stream contains “jumping jacks” labels at the start and at the end of the
session. For later datasets, check that the “Moving [sensor location] only”
activities are present.

Get a global view of the data and visually confirm that there are no large
unexpected data gaps or sensors that clearly have a quality issue that has not
already been noted. Add anything noticed to the notes file.

If there are any unexpected gaps in the annotations, especially large ones, load
the annotation software and check that nothing is wrong. Add an entry to the
notes file with an explanation.
Data Visualization (Part 2: Overall check for annotation label timing)

Uncheck the visualizer check boxes for the MITes (if collected in the
session), Columbia, RTI, SenseWear and HeartRate sensors. This will make it
easier to see the remaining plots of the sensor streams.

Verify that three peaks in the acceleration data are shown within each
jumping jacks data segment, one per jumping jack. To do this, first select the
segment labeled as “jumping jacks” and then click on the “raw view” button
located at the lower right corner of the Wockets Viewer. (This confirms that
the annotation computer clock was synced.) If you don’t see three peaks, add
an entry to the notes file.
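The viewer check is visual, but the same idea can be sketched in code as a cross-check, assuming the raw acceleration samples of a “jumping jacks” segment have been exported to a list. The function and the threshold value are hypothetical, and the threshold would have to be chosen per sensor; noisy data would also need smoothing first.

```python
def count_peaks(samples, threshold):
    """Count local maxima that rise strictly above `threshold`.

    A flat-topped peak is counted once, at its first sample.
    """
    peaks = 0
    for prev, cur, nxt in zip(samples, samples[1:], samples[2:]):
        if cur > threshold and cur > prev and cur >= nxt:
            peaks += 1
    return peaks


# Made-up samples with three bursts of movement.
segment = [0, 1, 9, 1, 0, 8, 2, 0, 1, 10, 3, 0]
n_peaks = count_peaks(segment, threshold=5)
```

A count other than three for a segment would be a reason to look at the raw view again before adding an entry to the notes file.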

Go back to the global view of all the sensors. Verify that the patterns of data
signals cluster within activities (e.g. walking/running on treadmill, bicycling
and lying on back). Check that the start and end annotation times look
accurate, i.e., that there are no activities where the label is clearly too short or
too long based on what you can see in the signal about the activity transitions.
For example, running on treadmill should maintain a steady pattern of high
values in all sensors. In contrast, lying on back should maintain a steady
pattern of low values in all sensors. The transitions between different activities
should line up with the annotation transitions. If you see anything suspicious,
add an entry to the notes file.
Data Visualization (Part 3: Confirm sensor location labeling)

Verify that the Wockets and MITes sensor placement labels make sense given
specific activities.
For newer datasets with the “Moving [sensor location] only” annotated
segments, highlight these segments and zoom in. The AUC (Area under the
curve) red line should have the greatest value for the segment that is marked
as moving. Confirm that each of the five sensors was in the location where it
was labeled. If there is a discrepancy, make a highlighted entry in the notes
file about which locations may be swapped based on this test (this is a serious
problem that will require some manual editing of files to fix!).
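The comparison for the “Moving [sensor location] only” segments can be sketched as below, assuming the AUC values for each segment have been read off the viewer and entered by hand. The function name and the nested dictionary layout are assumptions for illustration.

```python
def find_placement_mismatches(auc_by_segment):
    """Map each "Moving X only" segment to the sensor label with the
    highest AUC, keeping only the segments where that label is not X.

    auc_by_segment: {moving_location: {sensor_label: auc_value}}
    """
    mismatches = {}
    for moving_location, auc_by_label in auc_by_segment.items():
        most_active = max(auc_by_label, key=auc_by_label.get)
        if most_active != moving_location:
            mismatches[moving_location] = most_active
    return mismatches


# Made-up values in which the wrist and ankle labels appear swapped.
observed = {
    "wrist": {"wrist": 12, "ankle": 95, "hip": 8},
    "ankle": {"wrist": 90, "ankle": 10, "hip": 7},
    "hip": {"wrist": 9, "ankle": 11, "hip": 80},
}
swapped = find_placement_mismatches(observed)
```

Each entry in the result names a segment and the label that actually moved most, which is the information the highlighted notes entry needs.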
For older datasets without the “Moving [sensor location] only” annotated
segments, do the checks described below.

Thigh and ankle check. Go to the “biking” activity and check that the
wrist, hip and upper arm are not showing a significant amount of
movement by looking at the AUC metric (red dots). In this case, thigh
and ankle should register higher movement than the wrist, upper arm and
hip. The ankle should be moving more than the thigh. This check will
confirm that the thigh and ankle placement is as expected.
Note that you must check the values by pointing at some of the dots,
because the relative scales of the axes are different between different
sensors.

Hip check. Go to the “painting with brush” activity and check that the
wrist and upper arm are moving considerably more than the hip using the
AUC curve.

Wrist and upper arm check. Go to the “painting with brush” activity
and check that the wrist is moving more than the upper arm. If it is not
clear that one is more than the other, check the “sweeping” activity.
If any of these checks fail, make a note to the right and also in the notes
file about which locations may be swapped based on this test. This is a
serious problem that will require some manual editing of files to fix!
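The three checks for older datasets reduce to ordering rules on AUC values, sketched below. This assumes the values were read by pointing at the dots in the viewer; the function name, the dictionary keys, and the use of strict comparisons in place of “significantly” or “considerably” more are simplifying assumptions.

```python
def legacy_placement_failures(biking, painting):
    """Return the names of the placement checks that fail.

    biking, painting: {sensor_label: auc_value} for the five locations.
    """
    failures = []
    others = max(biking["wrist"], biking["hip"], biking["upper arm"])
    # Biking: ankle should move more than thigh, and both more than the rest.
    if not biking["ankle"] > biking["thigh"] > others:
        failures.append("thigh and ankle")
    # Painting: wrist and upper arm should both move more than the hip.
    if not min(painting["wrist"], painting["upper arm"]) > painting["hip"]:
        failures.append("hip")
    # Painting: wrist should move more than the upper arm.
    if not painting["wrist"] > painting["upper arm"]:
        failures.append("wrist and upper arm")
    return failures


# Made-up values: the painting data suggests wrist and upper arm are swapped.
biking = {"ankle": 90, "thigh": 60, "wrist": 5, "hip": 8, "upper arm": 4}
painting = {"ankle": 3, "thigh": 2, "wrist": 30, "hip": 6, "upper arm": 70}
failed = legacy_placement_failures(biking, painting)
```

A borderline result from this sketch is not conclusive; the checklist's fallback of checking another activity still applies.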
Dataset Checking Hand Off
For Stanford:
- Upload the data to the MIT server in the StanfordChecked directory.
- Send an email, attaching this file (with saved initial info), to the MIT team
indicating the data is ready for checking.
- If you don’t get a final confirmation from MIT that the data is checked within
TWO days, send another email asking about the data check status.
- Once you get confirmation from MIT that the data has been checked, download it
from the MIT server’s Final-AllChecked directory. Store it in a convenient
location.
For MIT:
- Add this file (with saved initial info) to the data directory.
- Zip the session folder, verifying that the zipped file has the appropriate folder
structure (“Session/Session/wockets”).
- Move the data to the Final-AllChecked directory on the server.
- Make a backup of the entire dataset to a DVD. Label it and give it to Stephen.
- Send an email to the Stanford team confirming the data collection for this
session is complete.
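The folder-structure check on the zipped session can be sketched with the standard `zipfile` module. The function name is hypothetical, and the folder path to look for is passed in by the caller, since the exact session folder name depends on the dataset.

```python
import zipfile


def zip_has_folder(zip_path, folder):
    """Return True if the archive holds at least one entry under `folder`.

    `folder` uses forward slashes, as zip archives do internally.
    """
    prefix = folder.strip("/") + "/"
    with zipfile.ZipFile(zip_path) as archive:
        return any(name.startswith(prefix) for name in archive.namelist())
```

For example, `zip_has_folder("session.zip", "Session/Session/wockets")` would return False if the wockets folder ended up one level too high or too low in the archive; the file name here is a placeholder.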