Commentary

D-Lib Magazine
April 2006

Volume 12 Number 4

ISSN 1082-9873

The Impact of Mandatory Policies on ETD Acquisition

 

Arthur Sale
University of Tasmania
<Arthur.Sale@utas.edu.au>

Abstract

This paper analyzes the data now available in Australia's coordinated Electronic Theses and Dissertations (ETD) gateway to show the impact of high-level institutional policy decisions on the population of the individual repositories. The paper shows that, as with research article repositories, voluntary ETD deposition results in repositories collecting less than 12% of the available theses, whereas mandatory policies are well accepted and cause deposit rates to rise towards 100%. The Ph.D. and Master processes in Australia are also modeled to indicate the delays and liabilities to be expected if mandatory policies are applied only to newly enrolled candidates.

Participation and software

The Australian university system has a common gateway to electronic theses and dissertations (ETDs) produced in its universities. This gateway, the Australian Digital Theses Program (ADT Program, 2005), harvests from 29 (74%) of the 38 Australian universities (there are 30 repositories, but two of them belong to one university), and is in its turn harvested by Google, Yahoo! and scholarly search engines. The remaining 9 universities have not yet established local ETD repositories. Nevertheless, the Program has a potential coverage of 84% of Australian university research degrees, since many of the non-contributors are smaller universities without strong research programs.

The participation rate must be regarded as a significant success for the ADT Program, achieved in a short time. An agreement has recently been reached with CONZUL (the Committee of New Zealand University Libraries) to have the eight New Zealand universities also contribute ETDs, but only one has as yet established a repository. The current situation is shown in Figure 1. The ADT Program has been running since 2000, long enough that an analysis of its success can be undertaken as a work in progress.

[Figure 1: Pie chart showing ADT Program participation]

The gateway harvests metadata from (local) leaf repositories in the universities for its own database. Searchers who find the gateway metadata pages, either through a search engine or via a direct search of the gateway, are directed to the leaf repository, which is the only place where the full text is stored. While the ADT Program is oriented around text documents that are required to be in PDF format, a few theses have associated materials such as software, again stored only in the leaf repositories.

Currently, 26 of the 30 active repositories run an old modified version of Virginia Tech software that is not harvested directly by any robot other than the ADT Program's harvester. Two universities (Western Australia and Curtin) use in-house solutions based around their library catalogue, and one other (Melbourne) feeds ETDs from its OAI-PMH-compliant EPrints repository, which also holds research articles. The software used by Melbourne was written by Tasmania, which used it as the production repository for six months; however, a strange decision by the University Librarian then resulted in a separate installation of the Virginia Tech software. Currently, Tasmania's two repositories mirror each other. Melbourne and Tasmania are the only two universities whose theses are directly harvested by general and scholarly search engines, though they may shortly be joined by others. The software situation is shown in Figure 2.

[Figure 2: Pie chart showing leaf repository software used]

Content in the Repositories

The theses deposited with the ADT Program were analyzed by university (Figure 3), by searching ADT on 'university' and restricting output to open-access ETDs (in other words, excluding metadata-only records). There are a couple of surprisingly high bars in the chart.

[Figure 3: Bar chart showing total numbers of ADT deposits by university]

The six that exceed 300 ETDs (an arbitrary level) do not all appear to be obvious candidates. The league table of documents and the corresponding numbers of 2003 research degree completions are shown in Table 1; the rank is also based on the 2003 degree completions nationwide.

Table 1 – Ranking of universities with highest ADT depositions

University                            Theses in ADT   Completions per year (2003)   Completions rank (2003)
University of Western Sydney          730             128                           18
Curtin University of Technology       556             217                           9
Griffith University                   425             167                           13
University of New South Wales         412             332                           4
Queensland University of Technology   343             175                           11
University of Melbourne               326             694                           1
Total universities data               5194            6329                          39

On analysis, the higher number of theses in these universities appears to derive from a higher rate of investment in digitizing theses that were first accepted and stored in paper form. A later section documents this behavior. This is to be welcomed, but it does not reveal how the universities are adapting to the 21st century and the Internet, or dealing with the imperatives that the open access movement imposes. Accordingly, a more detailed analysis of current deposition rates was undertaken.

The Australian Government collects data on the completions of graduate research degrees (Ph.D. and Master by research), which is publicly available (AVCC, 2005). It is therefore an easy matter to compare this with what is actually happening in the ADT gateway, except that the data is currently available only up to 2003. Accordingly 2003 was chosen as the benchmark year against which to compare ETD deposits. Surprisingly, Table 2 shows that so far only an estimated 12% of the current 2005 Australian thesis production has ended up as ETDs in the ADT system. What is the problem?

Table 2 – Overall success rate in attracting deposits

2003 Completions   2005 Theses in ADT   Deposit rate
6329               734                  12%
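
The arithmetic behind Table 2 can be checked with a short sketch. The function name `deposit_rate` is illustrative only and is not part of the ADT Program; it divides the 2005 open-access deposits by the 2003 completions benchmark, as the text describes.

```python
def deposit_rate(theses_deposited: int, benchmark_completions: int) -> float:
    """Return deposits as a fraction of the benchmark year's completions."""
    if benchmark_completions <= 0:
        raise ValueError("benchmark_completions must be positive")
    return theses_deposited / benchmark_completions


# Figures from Table 2: 734 theses dated 2005 in ADT, 6329 completions in 2003.
national_rate = deposit_rate(734, 6329)
print(f"{national_rate:.1%}")  # prints 11.6%, which the table rounds to 12%
```

The same function can be applied per university, using that university's 2005 deposits and its 2003 completions, which is how the percentages in the deposit-success chart are described as being derived.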

Policies

It is probable that the problem is the same one that is the bane of research article repositories. If graduates are left to contribute an ETD copy of their thesis voluntarily, few do so; the deposit rate usually sits at or below 15% (Hajjem et al. 2005). Although the extra work to provide an electronic copy is trivial, and the thesis is almost always produced electronically anyway, it is avoidable work and is consequently avoided (Swan & Brown, 2005). The reasons for trying to capture all theses at the point of submission are worth restating:

  • Once the graduate leaves the university, the chance of capturing an electronic copy thereafter is small, due both to human nature (loss of interest) and loss of the file during lifestyle and job changes (if it is even saved).
  • Failure to capture a digital copy of a thesis is forever. A subsequently scanned copy (consisting of page images) is not searchable by search engines or other digital software, and the indexability is lost.
  • A key reason is to achieve research impact. A thesis whose full text is not on-line and searchable has a very small chance of being cited or even read. The research may as well not have even been done, except for the graduation.

Accordingly it was decided to analyze thesis accumulations, comparing them with the known completions, and record whether deposit of an ETD was voluntary or mandatory (required in a Rule or Regulation). Twelve Australian universities had mandatory policies at the time of writing as shown in Table 3.

Table 3 – State of mandatory policies in Australia

University                            Mandatory policy
Griffith University                   Submissions after Jun 2002
Murdoch University                    Submissions after 2003
University of Wollongong              Submissions after Jun 2003
University of Queensland              Submissions after 2005
University of New South Wales         Submissions after 2005
Queensland University of Technology   Submissions after Jun 2005
University of Southern Queensland     Enrolments after 2001
Central Queensland University         Enrolments after 2002
University of Western Australia       Enrolments after 2003
Curtin University of Technology       Enrolments after 2005
Swinburne University of Technology    Enrolments after 2005
Monash University                     Enrolments after Jul 2005

It will be noted that in six cases this policy applies to all submissions after a given date and thus has immediate effect. Such policies are rated as having a potential effectiveness of 1.0 (or 100%) after their cut-in date. Another six universities have chosen to have the policy apply to all candidates who enrolled after a given date. This causes the effectiveness to increase gradually with time. Few Ph.D. candidates complete in three years; by four years some but not all of the full-time candidates have completed, and eight years is the normal maximum for part-time candidature, but even this may be extended by suspensions. Master results appear more quickly. Accordingly, a model-based effectiveness is assigned to all enrolment-date policies in force. The modeling is described in a later section.

Figure 4 shows the actual deposits into ADT in 2005, derived from searching the database by university and graduation date = '2005' again restricted to open-access ETDs, expressed as a percentage of 2003 completions (graduations). These are derived from the Australian Government's data on Ph.D. and research Master completions for 2003. The data was collected in March 2006, by which time all point-of-capture ETDs should have been deposited.

[Figure 4: Bar chart showing deposit success for 2005]

The results are the same as they are for research articles: a mandatory deposit policy pays off handsomely. Or, to put it into a two-line octosyllabic aphorism:

'Deposit voluntarily,
See empty repository.'

All five universities with effective policies in this chart (Griffith, Murdoch, Queensland University of Technology, Southern Queensland and Wollongong) show that a mandatory policy creates the climate for compliance, and is not overly resisted (Swan & Brown, 2005). Indeed all of the universities that have a deposit rate greater than 20% have a mandatory policy. Quite striking is the effect of QUT's policy, which only took effect in 2005; however, there is probably some synergy from a parallel research article self-archiving policy.

The results are quite clear: a mandatory policy causes deposits to rise to at least 50-80%, compared to the general voluntary policy rate of 5-15%. It is expected that over time the deposit rate under a mandatory policy would rise towards 100%, but attitudinal change takes some time.

Given this data it is surprising that the Australian Government has not specified that candidates who receive a government-funded scholarship (APA) must deposit an electronic copy of their thesis with the degree-granting university. The Government is aware of the issue, but chooses not to become involved in maximizing the outcomes of taxpayer funds. Of course, deposit does not mean 'open access' – it simply means that the ETD describing the research is captured in electronic form, with its metadata. Issues relating to subsequent exploitation by publication or commercial use are not affected.

Modeling of Enrolment-date Policies

In a previous section, the effective implementation rates of the two kinds of mandatory policy were mentioned. Here, the figures chosen as the probable target of a mandatory policy are justified.

If a mandatory policy applies to all thesis submissions after a given date, it is assumed to bring a 100% target into immediate effect. Such policies are rated as having a potential effectiveness of 1.0 (or 100%) relative to completions, if the decision year is the same as or predates the year under examination (in this paper 2004).

Other universities have chosen to have the policy apply to all candidates who enrolled after a given date, usually the date of the decision. This causes the effectiveness to increase with time. Data on the time distribution of completions was not readily available across Australia. Three years of completions (2003-2005 comprising 314 Ph.D.s and 82 Masters) from the University of Tasmania were therefore analyzed and tracked back to their enrolment date (Folvig, 2005). Figure 5 shows the distribution of the elapsed year of completion for these graduates.

[Figure 5: Bar chart showing completion time]

There is no reason to suspect that the University of Tasmania is substantially different from other Australian universities, and accordingly, these distributions are used to model the effect of mandatory policies dated from enrolment.

The available evidence suggests that voluntary policies lead to 12% compliance (the ADT average over Australia, also close to the 15% average for research article repositories, see Hajjem et al. 2005). Accordingly a realistic target T for an enrolment-date policy was computed from

$T = (f_i \times 1.0) + ((1 - f_i) \times 0.15) = 0.15 + (f_i \times 0.85)$
where $f_i$ is the fraction of candidates who will have completed in the year $i$

This assumes that all candidates for whom the policy has cut in are recorded at 100%, while the other candidates with enrolment dates preceding the policy change will voluntarily deposit at 15%. This function is graphed for Master and Ph.D. degrees in Figure 6. It will be seen that an enrolment-date policy does not reach 80% effectiveness (for either kind of degree) until five years after the decision is made!
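
The target formula above can be written as a small sketch. The function and constant names are illustrative assumptions, not taken from the paper, and the completion distribution from Figure 5 is not reproduced here, so the example only evaluates the formula at its end points and inverts it for the 80% level discussed in the text.

```python
VOLUNTARY_RATE = 0.15  # voluntary deposit rate assumed in the text


def target_effectiveness(completed_fraction: float,
                         voluntary_rate: float = VOLUNTARY_RATE) -> float:
    """T = f * 1.0 + (1 - f) * voluntary_rate, where f is the fraction of
    completing candidates who are covered by the enrolment-date policy."""
    if not 0.0 <= completed_fraction <= 1.0:
        raise ValueError("completed_fraction must lie between 0 and 1")
    return completed_fraction * 1.0 + (1.0 - completed_fraction) * voluntary_rate


def fraction_needed(target: float,
                    voluntary_rate: float = VOLUNTARY_RATE) -> float:
    """Invert the formula: the covered fraction f required to reach target T."""
    return (target - voluntary_rate) / (1.0 - voluntary_rate)


print(target_effectiveness(0.0))  # no covered candidates yet: voluntary rate only
print(target_effectiveness(1.0))  # every completing candidate is covered
print(f"{fraction_needed(0.80):.2f}")  # covered fraction needed for T = 0.80
```

With no covered candidates the target is simply the voluntary rate of 0.15, and by the algebra of the formula roughly three quarters of completing candidates must have enrolled under the policy before T reaches 0.80.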

[Figure 6: Bar chart showing the potential capture rates]

The value of T for 2004 for all universities with an enrolment-date policy is shown in Table 4.

Table 4 – Target effect of mandatory policies

University                           Policy date   T
Central Queensland University        2002          0.46
Curtin University of Technology      2005          0.15
Monash University                    July 2005     0.15
University of Southern Queensland    2001          0.68
Swinburne University of Technology   June 2005     0.15
University of New South Wales        2005          0.15
University of Western Australia      2003          0.18

Costs of delays in instituting fully effective mandatory policies can also be estimated, although it might be better to describe them as liabilities, which will be called to account if 100% deposition rate is considered desirable (this should be indisputable) and made retrospective.

To meet such a target, theses that had not been deposited as electronic copies would need to be scanned and uploaded. It is assumed that a retrospective attempt to capture electronic copies of old theses would have only small success, since academic and graduate record keeping and response rates are likely to be poor. A recent ProQuest study estimates the average time required for scanning at 2 hours (Talmacs, 2005), and the average cost at $130/thesis (estimate, unpublished correspondence).

  • Universities with already effective mandatory policies have no predictable liabilities in meeting such a target.
  • Universities with voluntary policies will incur an estimated liability of

       (0.85 × n) × $130

    for every year that they do not bring in a mandatory policy, where n is the number of annual completions. For example, for the University of Sydney (n = 550 in 2003), this amounts to $A60 800/year. The choice of a university to use as a hypothetical example is simply that of a large university with a current voluntary policy.
  • A university that adopts an enrolment-date policy instead of a submission-date policy incurs a liability through the ramping-up delay, which can also be computed using the model in this paper. Taking nP and nM as the number of annual completions for Ph.D. and Master degrees respectively, this liability is

       ((3.06 × nM) + (3.84 × nP)) × $130

    equivalent to about 3.6 years of a voluntary policy. To again use the University of Sydney as a hypothetical example, this amounts to an unnecessary liability of $A218 800. These liabilities will almost certainly devolve to the University Library of the relevant institution.
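
The two liability estimates in the list above can be sketched as follows. The function and constant names are illustrative assumptions; the coefficients and the $130 scanning cost are the ones quoted in the text. Only the voluntary-policy example is worked through, because the text does not give the split between Master and Ph.D. completions for the University of Sydney.

```python
SCAN_COST_PER_THESIS = 130   # dollars per thesis, the estimate quoted in the text
UNCAPTURED_FRACTION = 0.85   # share of theses not deposited under a voluntary policy
MASTER_COEFFICIENT = 3.06    # ramp-up coefficient for Master completions (from the text)
PHD_COEFFICIENT = 3.84       # ramp-up coefficient for Ph.D. completions (from the text)


def voluntary_policy_liability(annual_completions: float) -> float:
    """Yearly scanning liability while a voluntary policy remains in place."""
    return UNCAPTURED_FRACTION * annual_completions * SCAN_COST_PER_THESIS


def enrolment_policy_liability(master_completions: float,
                               phd_completions: float) -> float:
    """One-off liability from choosing an enrolment-date policy
    instead of a submission-date policy."""
    scanned = (MASTER_COEFFICIENT * master_completions
               + PHD_COEFFICIENT * phd_completions)
    return scanned * SCAN_COST_PER_THESIS


# University of Sydney example from the text: n = 550 completions in 2003.
print(round(voluntary_policy_liability(550)))  # 60775, quoted as about $A60 800
```

Keeping the two coefficients as named constants makes it easy to rerun the estimate if a different completion-time distribution or scanning cost is assumed.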

Retrospectivity

It was asserted earlier in the paper that the current thesis counts in ADT were skewed by retrospective deposits. Accordingly, the data for the University of Western Sydney and Curtin University of Technology were analyzed in detail. These are the two largest depositors by number in the ADT Program, but they rank only 18th and 9th among the Ph.D.-producing universities. The results of the analysis are shown in Figure 7.

Since the ADT Program was only opened up to general deposition in July 2000, all theses from 2000 and before (and some thereafter) are derived from retrospective scanning or serendipitous capture at an earlier time. The fraction is significant. Deposit rates across other universities also suggest that there is a relatively high degree of retrospectivity and a correspondingly lower level of current ETD capture. The success in gaining thesis content is disguising a comparative failure to collect born-digital ETDs at the point of final lodgment.

[Figure 7: Bar chart showing deposits by thesis year]

Consequences

For universities

  1. Universities that establish an ETD repository seem to be wasting their money if they maintain a voluntary deposit policy. Deposits are poor, running at 12% to 20% at most.
  2. Mandatory policies pay off handsomely in capturing all or most theses.
  3. Mandatory policies established from date of submission are 5-6 years faster in achieving 80% compliance than policies dated from enrolment. All new policies should specify that the policy applies from the date of submission of the final thesis. If there are any quasi-legal quibbles, the Library should specify that it requires both paper and electronic copies of the thesis to be deposited under the current rules, which are generally so old that they do not specify format explicitly.

For the Australasian Digital Theses Program

  1. As the ADT Program is aware, it must now attract the missing universities in Australia, accounting for 12% of the nation's graduate research completions, and the other seven New Zealand universities.
  2. However, a much more important challenge for ADT is the same one that faces operators of institutional research repositories: to increase the deposit rate from an average of 12% of possible thesis production to 100%. This suggests that the ADT Program should adopt a strong advocacy role regarding the adoption of mandatory thesis submission policies.

For the Australian Government

  1. The Australian Government could accelerate the development of mandatory submission policies by a simple administrative action, costing nothing. If the guidelines for Australian Postgraduate Awards (APAs) were slightly amended to require a completing graduate to deposit both a paper copy and an electronic copy of his or her thesis with the university from which they graduate, the deposit rate would jump immediately, and universities would rapidly insist on the same action by their graduates not in receipt of these awards. This is a simple consequence of the 'Backing Australia's Ability' policy.

Acknowledgments

The author would like to thank:

  • Ian Mitchell in Research Services and Vanessa Folvig in the Graduate Research Unit of the University of Tasmania, for quickly providing requested data;
  • Tony Carneglutti and Kerrie Talmacs at the University of New South Wales for ADT Program data, and for the Program gateway itself – the subject of this research.

This work was carried out in the School of Computing, University of Tasmania.

References

AuseAccess (2005). Australian Open Access Wiki available at <http://leven.comp.utas.edu.au/AuseAccess/>.

Australian Digital Theses Program (2005). Gateway available at <http://adt.caul.edu.au/>.

Australian Vice-Chancellors Committee (2005). HERDC Time Series Data 1992-2003. Available at <http://www.avcc.edu.au/documents/publications/stats/HERDCTimeSeriesData1992-2003.xls>

Folvig V, University of Tasmania (2005). Unpublished communication regarding completion data at the University of Tasmania.

Hajjem C, Harnad S, Gingras Y (2005). Ten-Year Cross-Disciplinary Analysis of the Growth of Open Access and Its Effect on Research Citation Impact. IEEE Data Engineering Bulletin 28(4) pp. 39-47. Accessed at <http://eprints.ecs.soton.ac.uk/11688/01/ArticleIEEE.pdf>.

Swan A and Brown S (2005). Open access self-archiving: An author study. <http://www.keyperspectives.co.uk/openaccessarchive/reports/Open%20Access%20II%20(author%20survey%20on%20self%20archiving)%202005.pdf>,
<http://eprints.ecs.soton.ac.uk/10999/>,
<http://cogprints.org/4385/>,
<http://www.jisc.ac.uk/uploaded_documents/Open%20Access%20Self%20Archiving-an%20author%20study.pdf>.

Swan A and Brown S (2004). JISC/OSI Journal Authors Survey Report. <http://eprints.ecs.soton.ac.uk/11002/>,
<http://cogprints.org/4125/>,
<http://www.jisc.ac.uk/uploaded_documents/JISCOAReport1.pdf>.

Talmacs K, University of New South Wales (2005). Unpublished communications regarding costs of thesis scanning.

Copyright © 2006 Arthur Sale

doi:10.1045/april2006-sale