Use this tool to find Catalan words in SUBTLEX-CAT that match certain lexical parameters (frequency, length, etc.).

It is intended to make it easier for researchers to find experimental materials.


Number of letters: Matches:

To look for matches, two wildcards can be used:

'_' - Matches any single letter (e.g., searching for 'p_ay' would return 'play', 'pray', etc.).

'%' - Matches any sequence of letters (e.g., searching for 'cr%ed' would return 'created', 'crowded', etc.).
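
The two wildcards resemble the '_' and '%' of SQL LIKE patterns. How the tool implements them is not documented here, so the following is only a minimal sketch of the matching rule, assuming that '%' may also stand for an empty sequence. The names wildcard_to_regex and match_words and the sample word list are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a '_' / '%' pattern into a regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")            # exactly one character
        elif ch == "%":
            parts.append(".*")           # any sequence (assumed: may be empty)
        else:
            parts.append(re.escape(ch))  # literal character
    return re.compile("".join(parts))


def match_words(pattern: str, words: list[str]) -> list[str]:
    """Return the words that match the whole pattern."""
    rx = wildcard_to_regex(pattern)
    return [w for w in words if rx.fullmatch(w)]


# Hypothetical sample list, reusing the English examples from the text.
sample = ["play", "pray", "plays", "created", "crowded", "cred"]
print(match_words("p_ay", sample))   # 'plays' is excluded: one letter too many
print(match_words("cr%ed", sample))  # 'cred' matches only if '%' may be empty
```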

Relative frequency (per million):
Exactly (number of letters):

Search for words with a specific length.

If you fill in all of the number-of-letters fields, the exact length takes precedence over the minimum and maximum lengths.

The valid range of values is between 1 and 34 letters. Values outside this range are set to the nearest valid value.
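
The rules for the number-of-letters fields can be summarised in a short sketch. It assumes the documented limits (1 and 34) and the documented precedence of the exact length. The function names clamp_length and resolve_length_range are hypothetical and are not taken from the tool itself.

```python
from typing import Optional, Tuple

MIN_LEN, MAX_LEN = 1, 34  # valid range stated in the help text


def clamp_length(value: int) -> int:
    """Move an out-of-range value to the nearest valid value."""
    return max(MIN_LEN, min(MAX_LEN, value))


def resolve_length_range(
    exact: Optional[int] = None,
    minimum: Optional[int] = None,
    maximum: Optional[int] = None,
) -> Tuple[int, int]:
    """Return the (shortest, longest) word length that the search would use."""
    if exact is not None:
        n = clamp_length(exact)  # the exact length prevails over min/max
        return n, n
    lo = clamp_length(minimum) if minimum is not None else MIN_LEN
    hi = clamp_length(maximum) if maximum is not None else MAX_LEN
    return lo, hi


print(resolve_length_range(exact=5, minimum=2, maximum=10))
print(resolve_length_range(minimum=0, maximum=50))
```

With all three fields filled, only the exact value is used. With out-of-range limits such as 0 and 50, the search falls back to the full range of 1 to 34 letters.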

Beginning with:

Search for words that start with a given sequence of letters.

Do not leave blank spaces in this box. The search is case sensitive and distinguishes between accented and unaccented letters.

The sequence cannot be longer than 6 characters.

Exactly (relative frequency):

Search for words with a specific frequency of occurrence per million.

If all of the frequency fields are filled in, the exact value takes precedence over the others.

Since the values of relative frequency include several decimals, an exact search would yield few results and therefore would not be practical.

For this reason, the search function works with integer values. For example, a search with a relative frequency of 20 will display words with a relative frequency between 20 and 20.99.
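
As an illustration of the integer-based search, the sketch below treats an exact value of 20 as the interval from 20 up to, but not including, 21. The half-open upper bound is an assumption; the help text only states the range as 20 to 20.99. The function name matches_exact_frequency is hypothetical.

```python
def matches_exact_frequency(word_freq: float, query: int) -> bool:
    """True if the word's frequency per million falls in [query, query + 1)."""
    # Assumed interpretation: everything from the integer up to the next one.
    return query <= word_freq < query + 1


for freq in (19.99, 20.0, 20.5, 20.99, 21.0):
    print(freq, matches_exact_frequency(freq, 20))
```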

Equal to or longer than:

Search for words with a length equal to or longer than the specified value.

The valid range of values is between 1 and 34 letters. Values outside this range are set to the nearest valid value.

The default value is 1.

Containing:

Search for words containing a particular sequence of letters.

Do not leave blank spaces in this box. The search is case sensitive and distinguishes between accented and unaccented letters.

The sequence cannot be longer than 6 characters.

Equal to or greater than:

Search for words with a frequency of occurrence per million equal to or higher than the specified value.

You can specify decimal values using "." as the delimiter. Any other delimiter (e.g., "," or "'") will override the search criteria.

Equal to or shorter than:

Search for words with a length equal to or shorter than the specified value.

The valid range of values is between 1 and 34 letters. Values outside this range are set to the nearest valid value.

The default value is 34.

Ending in:

Search for words that end in a certain sequence of letters.

Do not leave blank spaces in this box. The search is case sensitive and distinguishes between accented and unaccented letters.

The sequence cannot be longer than 6 characters.
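
The "Beginning with", "Containing" and "Ending in" fields can be pictured as three plain string tests. The sketch below is only an approximation: the help text does not say how the tool reacts to blank spaces or to sequences longer than 6 characters, so raising an error is an assumption. The names check_sequence and filter_words and the word list are hypothetical.

```python
MAX_SEQ_LEN = 6  # maximum sequence length stated in the help text


def check_sequence(seq: str) -> str:
    """Reject sequences that the form would not accept (assumed behaviour)."""
    if " " in seq:
        raise ValueError("the sequence must not contain blank spaces")
    if len(seq) > MAX_SEQ_LEN:
        raise ValueError("the sequence cannot be longer than 6 characters")
    return seq


def filter_words(words, begins="", contains="", ends=""):
    """Keep the words that satisfy all three fields (empty means no filter)."""
    for seq in (begins, contains, ends):
        check_sequence(seq)
    # Python string tests are case sensitive and accent sensitive,
    # which mirrors the behaviour described for the search boxes.
    return [
        w for w in words
        if w.startswith(begins) and contains in w and w.endswith(ends)
    ]


# Hypothetical word list for illustration only.
words = ["casa", "Casa", "camió", "camio", "cantar"]
print(filter_words(words, begins="ca"))  # 'Casa' is excluded (capital C)
print(filter_words(words, ends="ió"))    # 'camio' is excluded (no accent)
```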

Equal to or less than:

Search for words with a frequency of occurrence per million equal to or lower than the specified value.

You can specify decimal values using "." as the delimiter. Any other delimiter (e.g., "," or "'") will override the search criteria.