Amazon Macie

    Once Macie begins monitoring your data, it uses several classification methods to identify and prioritize your sensitive and critical data and to accurately assign business value to that data. Each classification has a designated risk level between 1 and 10, with 10 being the highest risk and 1 being the lowest.

    These methods include:

  • Content Type Classification

    – Macie determines the content type from an identifier that is embedded in the file header of each data object. Macie can assign only one content type to an object. You can’t modify existing content types or add new ones. You can only enable or disable the existing content types, which controls whether Macie assigns them to your objects during the classification process.

  • File Extension Classification

    – Macie offers a set of managed file extensions. Macie can assign only one file extension to an object. You can’t modify existing file extensions or add new ones. You can only enable or disable the existing file extensions, which controls whether Macie assigns them to your objects during the classification process.

  • Theme Classification

    – Object classification by theme is based on keywords that Macie searches for as it examines the contents of data objects. Macie can assign one or more themes to an object. You can’t modify existing themes or add new ones. You can only enable or disable the existing themes, which controls whether Macie assigns them to your objects during the classification process.

  • Regex Classification

    – Macie offers a set of managed regexes. Object classification by regex is based on specific data or data patterns that Macie searches for as it examines the contents of data objects. Macie can assign one or more regexes to an object. You can’t modify existing regexes or add new ones. You can only enable or disable the existing regexes, which controls whether Macie assigns them to your objects during the classification process.

  • PII Classification

    – Object classification by personally identifiable information (PII) is based on recognizing personally identifiable artifacts as defined by industry standards such as NIST SP 800-122 and FIPS 199.

  • Support Vector Machine–Based Classifier

    – Macie uses a support vector machine (SVM) classifier that combines the content of the S3 objects it monitors (text, token n-grams, and character n-grams) with their metadata features (document length, extension, encoding, headers) to accurately classify documents based on content.
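
    The rule-based methods above (content type, file extension, theme, and regex) can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not Macie's implementation: the rule tables, the names in them, the risk values, and the choice to report the highest matching risk as the object's overall risk are all hypothetical and do not reproduce Macie's managed content types, extensions, themes, or regexes.

```python
import os
import re

# Hypothetical rule tables with made-up risk levels (1 = lowest, 10 = highest).
MAGIC_BYTES = {
    b"%PDF-": ("application/pdf", 1),
    b"PK\x03\x04": ("application/zip", 1),
}
EXTENSIONS = {".pem": 8, ".csv": 3, ".txt": 1}
THEMES = {
    "financial": ({"invoice", "iban", "credit card"}, 5),
    "credentials": ({"password", "secret key"}, 7),
}
REGEXES = {
    "us_ssn_like": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 8),
    "email_like": (re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"), 4),
}


def classify(key: str, body: bytes) -> dict:
    """Apply the four rule-based methods to one object (illustrative only)."""
    result = {"content_type": None, "extension": None,
              "themes": [], "regexes": [], "risk": 1}
    risks = [1]

    # Content type: at most one, read from the leading bytes (file header).
    for magic, (content_type, risk) in MAGIC_BYTES.items():
        if body.startswith(magic):
            result["content_type"] = content_type
            risks.append(risk)
            break

    # File extension: at most one, taken from the object key.
    ext = os.path.splitext(key)[1].lower()
    if ext in EXTENSIONS:
        result["extension"] = ext
        risks.append(EXTENSIONS[ext])

    text = body.decode("utf-8", errors="ignore")
    lowered = text.lower()

    # Themes: zero or more, matched by keyword.
    for name, (keywords, risk) in THEMES.items():
        if any(keyword in lowered for keyword in keywords):
            result["themes"].append(name)
            risks.append(risk)

    # Regexes: zero or more, matched by data pattern.
    for name, (pattern, risk) in REGEXES.items():
        if pattern.search(text):
            result["regexes"].append(name)
            risks.append(risk)

    # Assumption: the overall risk is the highest risk among all matches.
    result["risk"] = max(risks)
    return result


if __name__ == "__main__":
    print(classify("reports/q1.txt", b"Invoice for Jane, SSN 123-45-6789"))
```

    In this sketch an object receives at most one content type and one file extension but any number of themes and regexes, mirroring the cardinalities described in the list above. Disabling a rule would correspond to removing its entry from the relevant table.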