AIMS: This study explored the suitability of the Alcohol Use Disorders Identification Test (AUDIT) for producing cross-nationally comparable estimates of problem drinking in general populations. At the item level, the focus was on responsiveness to cross-national and gender differences. For the set of items as a whole, the focus was on intercorrelations between items, which indicate the extent to which the AUDIT constitutes a scale.

METHODS: General population surveys from nine European countries were included. Cross-tabulations were used to analyse cross-national and gender differences in item scores. Reliability analysis was used to analyse intercorrelations between the items.

RESULTS: 'Blackouts' (men and women) and 'guilt and remorse' (women) were the most frequently reported consequences. Gender differences tended to be smaller for 'guilt and remorse' and 'concern of others', and largest for 'morning drinking'. The reliability analysis showed that in eight of the nine countries the frequency-of-drinking item lowered the alpha. The 'injury' and 'concern of others' items lowered internal consistency in three countries.

CONCLUSIONS: There was sufficient variation between countries in the pattern of responses, and in gender differences, to conclude that the set of consequence items was responsive to national and gender differences in problem drinking. Frequency of drinking was not a good indicator of problem drinking. The country differences in the item-total correlations of the consequence items might be due to differences in how these items are interpreted. Deciding which items to include in an instrument for cross-national comparison of problem-drinking estimates requires studies of how these items are interpreted in the general populations of different countries.
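
The reliability analysis described above can be illustrated with a minimal sketch. It assumes that the alpha reported is Cronbach's alpha, and that "lowers the alpha" means the scale's alpha rises when that item is removed. The data are synthetic and the function names are hypothetical; nothing here reproduces the study's surveys or results.

```python
import numpy as np


def cronbach_alpha(items):
    """Cronbach's alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)


def alpha_if_item_deleted(items):
    """Alpha recomputed with each item left out in turn."""
    items = np.asarray(items, dtype=float)
    return [cronbach_alpha(np.delete(items, j, axis=1))
            for j in range(items.shape[1])]


def corrected_item_total_correlations(items):
    """Correlation of each item with the sum of the remaining items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return [float(np.corrcoef(items[:, j], total - items[:, j])[0, 1])
            for j in range(items.shape[1])]


# Synthetic illustration (NOT the study's data): four items driven by one
# shared latent trait, plus one item unrelated to that trait in column 0.
rng = np.random.default_rng(0)
n = 500
latent = rng.normal(size=n)
related_items = latent[:, None] + rng.normal(size=(n, 4))
unrelated_item = rng.normal(size=(n, 1))
scores = np.hstack([unrelated_item, related_items])

alpha_all = cronbach_alpha(scores)
alpha_dropped = alpha_if_item_deleted(scores)
item_total = corrected_item_total_correlations(scores)

print(f"alpha, all items: {alpha_all:.3f}")
for j, (a, r) in enumerate(zip(alpha_dropped, item_total)):
    print(f"item {j}: alpha if deleted = {a:.3f}, item-total r = {r:.3f}")
```

In this sketch, an item "lowers the alpha" when its alpha-if-deleted value exceeds the alpha for the full item set, which is the pattern the unrelated item in column 0 is constructed to show.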