Saturday, July 31, 2010

The Learning Registry: A First Look

The federal government plans to make its educational resources much easier to find and use through a new Learning Registry.
The project will make federal learning resources, wherever they are stored, easier to find, easier to access and easier to integrate into learning environments around the country and world. The Learning Registry will enable teachers, students, parents, schools, governments, corporations and non-profits to build and access better, more interconnected and personalized learning solutions needed for a 21st century education.

The project will accomplish this goal by identifying and implementing existing technologies and interfaces, as well as exposing learning resources to search engines so they are easily discoverable. We want to ensure that innovators, educators and the general public can access these resources easily by enabling the development of learning solutions utilizing federal assets in new and unanticipated ways.

If the project is successful, it will not only enable access to federal resources but could also lead to more consistent methods for accessing learning resources, regardless of source or provider. The end result will be a network of discoverable federal resources that can be used in a wide variety of applications, including textbooks, open educational resources, mobile applications and online learning products. We will invite other learning resource providers to participate in the network, so that students and teachers have access to effective resources from other sources as well.


Introduction
On Wednesday, July 21, 2010, at the Rural Education Technology Summit, Secretary Duncan of the US Department of Education, Secretary Clough of the Smithsonian Institution and Chairman Genachowski of the Federal Communications Commission (FCC) jointly announced (video here) the establishment of the Learning Registry for the federal government to make resources from various government sources easier to find and use. This is the official start of a major project that has been under investigation, and leverages and amplifies the on-going work of many other organizations, governments and individuals.

This discussion of the Learning Registry represents the best thinking of a number of people and agencies at the federal government and beyond. But this work is in its early stages and we need the help of many people to get it right. We welcome your involvement. You can communicate with the Learning Registry project team at this IdeaScale site. We will be monitoring and participating in conversations there. Please let us know what you think, and rate and discuss the ideas of others.

What is the Learning Registry and who is involved?
The Learning Registry is a project among a (growing!) number of federal organizations including the Department of Education, Department of Defense, National Science Foundation, the Office of Science and Technology Policy at the White House and the Federal Communications Commission. We’ve been consulting with the National Institute of Standards and Technology (NIST) at the Department of Commerce, the National Archives and Records Administration, the data.gov team, as well as both the Federal CIO and CTO. We’ve had preliminary conversations with NASA, the Smithsonian and the Department of Energy. We plan to engage with many other agencies and organizations in this work, both inside and outside of the federal government. In the initial phase, this project relates only to the learning resources managed by the federal government (and by learning resource, we’re currently taking a broad view to include primary source, historical and culturally significant materials, as well as resources designed specifically for educational use).

Secretary Duncan talked about how this system can benefit educators during the announcement, and I will expand on his remarks a bit here:

Let’s imagine you’re a high school physics teacher or the head of an online learning company. In either case you might want to build a course on the early years of the US space program in a way that integrates history, writing and physics. You might want to use resources that are available from the federal government in this work. In searching for those resources, you learn that each agency has its own repositories (often many of them) and you have to search each site to find the materials. Even internet search engines are of limited (though still significant) help. Finding the right information stored at these different agencies requires significant web research expertise. At this point today you might give up your search because it will take too much of your time to find the resources you need.

The point of the Learning Registry is to make it much easier to find and access these federal assets. The benefits of finding resources stored in the federal government can be significant. In this case the Learning Registry would help you discover that NASA has numerous photographs of the missions and related information. The National Archives has the transcript of a historic phone call between Nixon and Armstrong, made while Armstrong was on the moon. NOAA has a history of its weather program that supported the early rocket launches and landings. The point is that it shouldn’t be hard to find related learning resources around government, just like it shouldn’t be hard to find similar resources from around the internet.

We aren’t trying to boil the ocean with this project, and can’t solve every issue ourselves. In particular, we want to be sure there is room for others to participate and innovate beyond what we are able to do. Recognizing this, we plan to share learning resources within the Learning Registry in a way that makes it much easier to find and access them. One important technique is to enable the categorization and sharing of resources based on information beyond that made available by the original publisher. In the trade this is called “metadata augmentation” and if we do this project right we can enable websites, users, learning management systems and others to classify, rate, review and re-publish federal resources in a way that helps more people find the resources most useful for their needs. The idea that “none of us are as smart as all of us” underlies this strategy.

We also aren’t planning to develop a sophisticated portal to access the resources—we think others in the field can do this better than we can. We aren’t going to invent new algorithms that direct users to the most relevant resources: that’s a job best left for the experts in industry. Our goal is to publish our resources along with as much information about the resources as possible. The community will make use of them however makes sense to their users.

Some Hard Questions to Answer
This project raises some hard problems. Let me outline some of them in more detail:
  1. We need to aggregate and “cross-walk” metadata (i.e. classifying information about each resource) between agencies, so that a single search can turn up relevant resources from all agencies. This is a hard information problem and will probably involve a lot of elbow grease, as well as some sophisticated technologies.
  2. Building on social networking concepts, we also want anyone to add additional metadata about the resources without having to run their metadata by anyone first. (Introduced above as “metadata augmentation.”) For example, if you are a school district in Arizona and you integrate NASA materials into one of your online courses, you might find that you are using the resources in a significantly different way than how they are categorized by NASA (they might identify the content as historical resources, but you find that they can be used to teach physics). You might digitally post your categorization of these resources, and your information should inform someone else’s search for physics resources in the Learning Registry. This concept raises a number of hard problems such as:
    1. How can such a system perform well? Is there such a thing as “too much metadata?”
    2. How can searchers/search engines distinguish between high quality and lower quality metadata? What about “spammy” metadata? When is “quality” an absolute measure and when is it relative (to searcher, to category, to applied purpose, etc.)?
    3. How can we ensure that the source of any metadata is authentic?
  3. In addition to metadata that categorizes or classifies resources being published from any source, we also want usage-based metadata (what the National Science Digital Library has termed paradata) to be publishable and findable within the network.
    1. So if you are in a school using resources in your learning management system (LMS), there should be an easy way for the LMS to optionally publish anonymous data about how the resources are being used, so others can benefit from your experiences.
      1. This might be like how many software companies today ask you when you install a program if you will allow anonymous data to be sent back to the company to improve their products. In this case, we don’t want the data to be sent back to any specific company—we want it to be available for consumption by any user, company or researcher.
  4. We want the system to be very scalable and have high performance. Flexibility and performance are often at odds in software design (I’ve heard programmers say “Cheap, flexible, fast—pick any two”). How can we make a system that is scalable, fast, flexible and cost effective to develop, use and maintain? We think we can do this by leveraging the hard work and investments already made by many others including the National Science Digital Library, Department of Defense’s ADL Registry and CORDRA architecture, the European SchoolNet, Globe, Ariadne, IMS, SIFA, Connexions, OERCommons, the UK Joint Information Systems Committee, the Australian national government, California Digital Library and many, many others. (Our apologies if we’ve omitted anyone we’ve been working with). If you know of a project that we should know about, let us know who they are.
    1. One example of a potential performance problem is federating discovery of resources from the entire network.
      1. How can we get metadata augmentation from many different sources in a way that doesn’t create gatekeepers but also permits high performance searches? From what we’ve heard so far this is going to involve “harvesting” of some kind. Tell us what you think about this.
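
To make the first two items in the list above more concrete, here is a minimal sketch in Python of a metadata "cross-walk" combined with metadata augmentation. It is only an illustration under stated assumptions, not a design the project has adopted: the agency field names, the shared vocabulary, the example.org URLs, and the names `CROSSWALK`, `crosswalk`, `Resource`, `augment` and `search` are all hypothetical. The sketch records who made each added assertion but does nothing to authenticate that source or judge its quality, which are exactly the open questions raised above.

```python
from dataclasses import dataclass, field

# Hypothetical per-agency field names mapped onto one shared vocabulary.
CROSSWALK = {
    "nasa": {"title": "title", "topic": "subject"},
    "nara": {"record_name": "title", "category": "subject"},
}


def crosswalk(agency, record):
    """Rename an agency's native fields to the shared field names."""
    mapping = CROSSWALK[agency]
    return {mapping[k]: v for k, v in record.items() if k in mapping}


@dataclass
class Resource:
    url: str
    publisher_metadata: dict
    # Assertions added later by third parties: (source, field name, value).
    augmentations: list = field(default_factory=list)

    def augment(self, source, field_name, value):
        """Attach third-party metadata without any gatekeeper approving it."""
        self.augmentations.append((source, field_name, value))

    def subjects(self):
        """All subject values, from the publisher and from third parties."""
        found = {self.publisher_metadata.get("subject")}
        found |= {v for _, f, v in self.augmentations if f == "subject"}
        found.discard(None)
        return found


def search(resources, subject):
    """Return URLs of resources that anyone has tagged with this subject."""
    return [r.url for r in resources if subject in r.subjects()]


# Two agencies describe their resources with different field names.
nasa = Resource(
    "https://example.org/nasa/apollo-photos",
    crosswalk("nasa", {"title": "Apollo photographs", "topic": "history"}),
)
nara = Resource(
    "https://example.org/nara/moon-call",
    crosswalk("nara", {"record_name": "Moon phone call", "category": "history"}),
)

# A school district reports that it uses the NASA material to teach physics.
nasa.augment("arizona-district", "subject", "physics")

physics_hits = search([nasa, nara], "physics")
history_hits = search([nasa, nara], "history")
```

In this sketch a search for physics finds the NASA resource only because of the district's added tag, while a search for history finds both resources through the publishers' own cross-walked metadata.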
There are many other challenging problems including (but not limited to): uniquely identifying learning resources, dealing with license and access limits for some resources, relating learning resources to curricular standards, providing tools to integrate the Learning Registry into both the social networking world and the world of linked data. Let us know what other problems you think we need to look at.
Here’s a picture that lays out some of the hard problems graphically (thanks especially to Richard Culatta at CIA for his graphical contributions to this).

Please keep in touch as you come up with ideas or resources that can help us accomplish these goals. We really think this project has huge potential to make a difference for teachers and students across the country, as well as the companies, non-profits, school districts, states and others who support the students and schools in the US.
About the Author
Steve Midgley serves as the Deputy Director of The Office of Education Technology at the US Department of Education, and one of the main projects he is leading is the Learning Registry.
