Error Code: 62:324 Magic Number of the OML does not match

Article ID: MM0039 (CRITICAL ERROR) Tape media job fails with On Media Label (OML) Magic Number error. Possible Data Loss.

Symptom

"Magic Number" refers to the tape's unique ID, which is written to the On Media Label (OML). "Does not match" means that the unique ID in the OML does not correspond to the tape's physical label (barcode); the relationship between the two is maintained in the CommServ database. OML verification ensures that the correct tape is loaded for read/write, and it occurs at mount and unmount time.

When performing a Backup, Restore, or AuxCopy operation, the error "Magic Number of the OML does not match" is reported in the Job Controller, Job Details, and Event Viewer.

 

Example:

Error Code: [62:324] Description: Failed to mount media with barcode [ ], side [ ], into drive [ ], in library [ ] on MediaAgent [ ]. Reason: Magic Number of the OML does not match. Advice: Data on media may be corrupted. Media will be marked bad.

Error: Failed to copy or verify chunk [ ] in media [ ] for storage policy [ ] copy [ ]: Backup job [  ]. Magic Number of the OML does not match.

 

When examining the logs, you may see entries similar to the following:

JobManager
----------------
4028 70c [date/time] 393548 Scheduler Set pending cause [Error occurred while processing chunk [ ] in media [ ] for storage policy [ ] copy [ ]: Backup job [ ]. Magic Number of the OML does not match..]::Client [ ] Application [AuxCopyMgr] Message Id [218103935] RCID [9185406] ReservationId [0]. Level [0] flags [0] id [0] overwrite [0] append [1] CustId[0].

AuxCopyManager
-----------------
1192 1598 [date/time]  393548 AuxCopyManager::handleFailReport Source < > Target < >: AuxCopy binary on media agent encountered error [16] MM error [776] when reading chunk [ ]: [Magic Number of the OML does not match.]

AuxCopy
------------------
2932 1384 [date/time]  393548 Data Reader open error for AG [ ] Dest copy [ ] Chunk [ ]: Magic Number of the OML does not match.

MediaManager
------------------

6452 18cc [date/time] ##### MOUNT [213] ERROR:(verify OML) DUE TO ** [Magic Number of the OML does not match.](776) ** FUNC[(MLMReserveMountVol::handleWrongSideMountedAndOtherOMLError)(MLMTapeMount.cpp:2955)] PARAMETERS: LIBRARY [ ] DRIVE [ ] MEDIA [ ]MEDIAGRP[ ] DRIVEPOOL [ ] DRIVEHOST [ ] LIBRARYHOST [ ] LMS [ ] DMS [ ]
6452 18cc [date/time]  ##### MOUNT [ ] Marked media [ ] bad with reason as OML mismatch due to error [776, Magic Number of the OML does not match.]
6452 588 [date/time]  ##### MOUNT [55 ] ERROR:(verify OML) DUE TO ** [Magic Number of the OML does not match.](776) ** FUNC[(MLMReserveMountVol::handleWrongSideMountedAndOtherOMLError)(MLMTapeMount.cpp:2955)] PARAMETERS: LIBRARY [ ] DRIVE [ ] MEDIA [ ]MEDIAGRP[ ] DRIVEPOOL [ ] DRIVEHOST [ ] LIBRARYHOST [ ] LMS [ ] DMS [ ]
6452 588 [date/time]  ##### MOUNT [55 ] Marked media [ ] bad with reason as OML mismatch due to error [776, Magic Number of the OML does not match.]
6452 18cc [date/time]  ##### MOUNT [ ] Incremented error counters for drive [ ] and media [ ] due to error [776, Magic Number of the OML does not match.]
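
When many tapes are involved, it can help to summarize the MediaManager log rather than read it line by line. The following is a minimal sketch in Python, assuming the log lines look like the samples above; the patterns are derived from those samples only, and real log formats may differ between versions. The function name is hypothetical and not part of the product.

```python
import re

# Patterns assumed from the sample MediaManager lines above; real logs may differ.
OML_ERROR = re.compile(r"Magic Number of the OML does not match")
MARKED_BAD = re.compile(r"Marked media \[(?P<media>[^\]]*)\] bad")


def summarize_oml_errors(lines):
    """Count OML mismatch lines and collect the media that were marked bad."""
    total = 0
    marked_bad = []
    for line in lines:
        if not OML_ERROR.search(line):
            continue  # not an OML mismatch entry
        total += 1
        match = MARKED_BAD.search(line)
        if match:
            # The bracketed value after "Marked media" is the media identifier.
            marked_bad.append(match.group("media").strip())
    return total, marked_bad
```

Passing the lines of a MediaManager log to summarize_oml_errors returns the number of mismatch entries and the list of media identifiers that were marked bad, which gives a starting list of tapes to verify.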

Cause

A common source of the problem is the Windows Removable Storage Management (RSM) Service. Stop and permanently disable the Removable Storage Management Service on all Windows MediaAgents and on any other Windows systems that have access to the tape libraries.

The issue can also be caused by any intrusive application that can send Load, Rewind, or SCSI Reset commands to the tape drives. When such an event occurs outside of CommVault Software, the drive rewinds to the beginning of the tape. If a Data Protection or AuxCopy job is running at the time, the OML is overwritten with data from the job that was using that drive.
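
The mechanism described above can be illustrated with a toy model. This is a conceptual sketch only, not the product's implementation: the Tape class, its methods, and the ID value are all hypothetical, and a real drive works with block positions and filemarks rather than a Python list.

```python
class Tape:
    """Toy model of a tape: block 0 holds the OML, later blocks hold job data."""

    def __init__(self, unique_id):
        self.blocks = [("OML", unique_id)]
        self.position = 1  # the next write goes after the label

    def write(self, data):
        # A write lands at the current head position, replacing whatever is there.
        if self.position < len(self.blocks):
            self.blocks[self.position] = ("DATA", data)
        else:
            self.blocks.append(("DATA", data))
        self.position += 1

    def rewind(self):
        # Models a Load/Rewind/SCSI Reset sent by software outside the backup product.
        self.position = 0

    def verify_oml(self, expected_id):
        # Models the check performed at mount and unmount time.
        return self.blocks[0] == ("OML", expected_id)


def simulate_external_rewind():
    """Return the OML check result before and after an external rewind mid-job."""
    tape = Tape("ID-123")          # hypothetical unique ID
    tape.write("chunk 1")          # normal job write, label untouched
    before = tape.verify_oml("ID-123")
    tape.rewind()                  # external application rewinds the drive
    tape.write("chunk 2")          # job keeps writing, now over the label
    after = tape.verify_oml("ID-123")
    return before, after
```

In this model the label check passes before the external rewind and fails after it, because the running job's next write replaces block 0.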

Tape drive monitoring applications can also cause this problem. For example, on HP-UX MediaAgents, dm_stape, a default application that runs in the background on HP-UX systems, can cause the problem. On Linux MediaAgents, the Veritas Storage agent software (process name hbaapp) can cause the problem.
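
A quick way to check a UNIX or Linux MediaAgent for the processes named above is to scan a process listing for them. The following is a minimal sketch, assuming the listing is the text output of a command such as ps -ef captured by the administrator; the function name is hypothetical, and the set contains only the two process names mentioned in this article.

```python
import posixpath

# Process names taken from the causes listed above; extend as needed.
KNOWN_INTRUSIVE = {"dm_stape", "hbaapp"}


def find_intrusive_processes(ps_output):
    """Return the known intrusive process names found in `ps -ef` style output."""
    found = set()
    for line in ps_output.splitlines():
        for token in line.split():
            # Strip any directory part so full paths also match.
            name = posixpath.basename(token)
            if name in KNOWN_INTRUSIVE:
                found.add(name)
    return sorted(found)
```

An empty result does not prove that the host is clean, since other software that is not listed here may also send commands to the drives.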

Hardware issues have also been reported as a cause. In one case, the drive firmware was the problem:

(Example: ADIC Scalar 1000 Library with Quantum SDLT320 drives. The drives went into a debug mode when more than one query/request was sent to them, which should not normally have been an issue. The customer needed firmware version v87 on the Quantum drives.)

Resolution

To prevent further data loss

  1. Suspend all active jobs that use Tape Libraries.
    1. In the Job Controller window, select all jobs using tape libraries.
    2. Right-click and select Multi-job Control.
    3. Select Suspend - All Selected Jobs.
    4. Click OK.
  2. Disable use of all Tape Libraries until the cause of the problem has been determined and corrected.
    1. For each Tape Library, open the Properties dialog box.
    2. Select the Status tab.
    3. Clear the Enable Library option.
  3. Enable SCSI-3 reservation for each Tape Library.
     Note: Older tape devices may not support SCSI-3 reservation. Consult your hardware vendor before proceeding.
    1. For each Tape Library, open the Properties dialog box.
    2. Select the Drive tab.
    3. In the SCSI Reservation group box, select Use SCSI-3 Reserve for contention resolution.
  4. Check for any non-CommVault tape manipulation or monitoring software that could access or write to a tape. In a SAN environment, only CommVault Software MediaAgents should have visibility to the tape drives. See the causes above for known problems.

To Verify the On Media Label (OML)  

  1. Identify the barcode of the media in question.
  2. On a MediaAgent with access to the library containing the media, open a command prompt and navigate to the CommVault software \Base directory.
  3. Log in to the Commvault Command Line Interface:
   qlogin
  4. Verify the OML using the media barcode:
   qmedia verify -b <barcode>

The command may return one of the following responses.

None (Command hangs)

MediaAgent Services are stopped.  Restart MediaAgent Services and retry the command.

Error 0x303: Invalid barcode name

Barcodes are case sensitive. Verify the barcode and retry the command.

Failed to verify media.

The media is external or cannot be loaded into a drive. Verify that:

  1. The media is in the library.
  2. MediaAgent services are running on the host.
  3. The library is accessible from the host.
  4. At least one drive is active and unoccupied.

Then retry the command.

Media in library [ ], slot [ ] does not have a valid media label which means that it was not written to by this product.

The OML is missing or corrupt, or the media is a new or erased tape. Verify that you have the correct barcode and retry the command. If the error persists, OML verification has failed.

Media in library [ ], slot [ ], has Unique [ ], creation time [ ], CommCell Number [ ] and belongs to this instance of the CommCell

The media has a valid OML. If the media still reports an OML Magic Number error when used by a CommCell job, contact your Commvault support vendor for assistance.
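
The responses above can be summarized as a simple lookup from message text to next step. The following is a minimal sketch, assuming the command's output has been captured as a string and that an empty string stands in for a command that hung and was stopped by the operator; the function name is hypothetical, and the matched phrases are taken only from the messages listed in this article.

```python
def classify_verify_response(output):
    """Map the text returned by the OML verification command to a suggested next step."""
    text = output.strip()
    if not text:
        # No output at all: the article attributes a hang to stopped services.
        return "no response: restart MediaAgent services and retry"
    if "Invalid barcode name" in text:
        return "invalid barcode: barcodes are case sensitive, correct it and retry"
    if "Failed to verify media" in text:
        return "media not loadable: check library, services, access, and drives, then retry"
    if "does not have a valid media label" in text:
        return "OML missing or corrupt: confirm the barcode; if it persists, verification failed"
    if "belongs to this instance of the CommCell" in text:
        return "valid OML: contact support if jobs still report the error"
    return "unrecognized response"
```

Matching on fixed phrases is fragile if message wording changes between versions, so any unrecognized response should be read manually.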

  1. If the OML verification fails, any data on the media is most likely unrecoverable. If you still want to attempt to recover data that may exist on the media, contact your Commvault support vendor for assistance.
  2. If the OML verification fails and you want to re-use the media, you can delete the media contents and erase the tape. See the following topics in Books Online:

Delete Contents on Media

Erase Spare or Retired Media

  3. If the source of the error has been corrected, resume the jobs to the tape libraries.
    1. In the Job Controller window, select all jobs using tape libraries.
    2. Right-click and select Multi-job Control.
    3. Select Resume - All Selected Jobs.
    4. Click OK.