Registration Dossier

Data platform availability banner - registered substances factsheets

Please be aware that this old REACH registration data factsheet is no longer maintained; it remains frozen as of 19th May 2023.

The new ECHA CHEM database has been released by ECHA, and it now contains all REACH registration data. Details on the transition of ECHA's published data to ECHA CHEM are available from ECHA.

Physical & Chemical properties

Water solubility

Administrative data

Link to relevant study record(s)

Reference
Endpoint:
water solubility
Type of information:
(Q)SAR
Adequacy of study:
key study
Study period:
2019
Reliability:
2 (reliable with restrictions)
Rationale for reliability incl. deficiencies:
accepted calculation method
Justification for type of information:
SOFTWARE: EPI Suite (v 4.11)

MODEL: WSKOWWIN v 1.42

SMILES OR OTHER IDENTIFIERS USED AS INPUT FOR THE MODEL: O=C(OCc1ccccc1)c2ccc(cc2)/N=N/C3=C(C)C(C#N)=C(O)N(CCCC)C3=O
Experimental logPow and melting point were used as input

SCIENTIFIC VALIDITY OF THE (Q)SAR MODEL
A complete description of the estimation methodology used by WSKOWWIN is available in two documents prepared for the U.S. Environmental Protection Agency, Office of Pollution Prevention and Toxics (Meylan and Howard, 1994a,b).
A journal article that describes the methodology is also available (Meylan et al., 1996). The WSKOWWIN program estimates the water solubility of an organic compound using the compound's log octanol-water partition coefficient (log Kow). A brief description is given below.

Data collection
A database of more than 8400 compounds with reliably measured log Kow values had already been compiled from available sources. Most experimental values were taken from a "star-list" compilation of Hansch and Leo (1985) that had already been critically evaluated (see also Hansch et al., 1995) or an extensive compilation by Sangster (1993) that includes many "recommended" values based upon critical evaluation. Other log Kow values were taken from sources located through the Environmental Fate Data Base (EFDB) system (Howard et al., 1982, 1986). A few values were taken from Section 4a, 8d, and 8e submissions to the U.S. EPA under the Toxic Substances Control Act (see http://www.syrres.com/esc/tscats_info.htm).
Water solubilities were collected from the AQUASOL dATAbASE™ of the University of Arizona (Yalkowsky and Dannenfelser, 1990), Syracuse Research Corporation's PHYSPROP© Database (SRC, 1994), and sources located through the Environmental Fate Data Base (EFDB) system (Howard et al., 1982, 1986). Water solubilities were primarily constrained to the 20-25 °C temperature range, with 25 °C being preferred.
Melting points were collected from sources such as the AQUASOL dATAbASE™, PHYSPROP©, and EFDB, as well as the Handbook of Chemistry and Physics (Lide, 1990) and the Aldrich Catalog (Aldrich, 1992).

Regression & results
A dataset of 1450 compounds (941 solids, 509 liquids) having reliably measured water solubility, log Kow and melting point was used as the training set for developing the new estimation algorithms for water solubility. Standard linear regressions were used to fit water solubility (as log S) with log Kow, melting point and molecular weight.
Residual errors from the initial regression fit were examined for compounds sharing common structural features with relatively consistent errors. On that basis, 12 compound classes were initially identified and added to the regression to comprise a multi-linear regression including log Kow, melting point and/or molecular weight plus 12 correction factors. Each correction factor is counted a maximum of once per structure [if applicable], no matter how many times the applicable fragment occurs. For example, the nitro factor in 1,4-dinitrobenzene is counted just once. A compound either contains a correction factor or it doesn't; therefore, the matrix for the multi-linear regression contained either a 0 or 1 for each correction factor.  
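
The once-per-structure rule described above can be illustrated with a minimal Python sketch. It is not WSKOWWIN code: the function name and the class names other than "nitro" are hypothetical placeholders, and the fragment counts are assumed to come from some separate structure-matching step.

```python
from typing import Dict, List, Sequence

# Hypothetical class names for illustration; only "nitro" is named in the text.
CORRECTION_CLASSES: Sequence[str] = ("nitro", "class_b", "class_c")

def indicator_row(fragment_counts: Dict[str, int],
                  classes: Sequence[str] = CORRECTION_CLASSES) -> List[int]:
    """Return one regression-matrix row: 1 if the fragment occurs at all, else 0."""
    return [1 if fragment_counts.get(name, 0) > 0 else 0 for name in classes]

# 1,4-dinitrobenzene contains two nitro groups, but the factor is counted once.
row = indicator_row({"nitro": 2})
```

Because the row holds only 0 or 1, the number of times a fragment occurs has no influence on the fitted correction.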

WSKOWWIN estimates water solubility for any compound with one of two possible equations. The equations are equations 19 and 20 from Meylan and Howard (1994a) or equations 11 and 12 from the journal article (Meylan et al., 1996). The equations are:
log S (mol/l)  =  0.796 - 0.854 log Kow - 0.00728 MW + ΣCorrections
log S (mol/l)  =  0.693 - 0.96 log Kow - 0.0092(Tm-25) - 0.00314 MW + ΣCorrections

where MW is molecular weight, Tm is melting point (MP) in deg C [used only for solids] ...When a measured MP is available, the equation containing Tm is used; otherwise, the equation with just MW is used.
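
As a rough illustration of how the two equations and the selection rule fit together, the following Python sketch encodes them directly. The function names are hypothetical, the sum of correction factors is passed in as a single number, and dropping the (Tm - 25) term for substances melting at or below 25 °C is an assumption based on the "used only for solids" remark.

```python
from typing import Optional

def estimate_log_s(log_kow: float, mw: float,
                   tm_c: Optional[float] = None,
                   corrections: float = 0.0) -> float:
    """Estimate log10 of water solubility in mol/l from the two equations above."""
    if tm_c is None:
        # No measured melting point: use the MW-only equation.
        return 0.796 - 0.854 * log_kow - 0.00728 * mw + corrections
    # Assumption: the (Tm - 25) term is dropped for liquids (Tm <= 25 deg C).
    melting_term = max(tm_c - 25.0, 0.0)
    return (0.693 - 0.96 * log_kow - 0.0092 * melting_term
            - 0.00314 * mw + corrections)

def log_s_to_mg_per_l(log_s: float, mw: float) -> float:
    """Convert log S (mol/l) to mg/l: mol/l * g/mol * 1000 mg/g."""
    return (10.0 ** log_s) * mw * 1000.0
```

For example, `log_s_to_mg_per_l(estimate_log_s(log_kow, mw, tm_c), mw)` gives an estimate in mg/l, the unit used for the reported result.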


5. APPLICABILITY DOMAIN
WSKOWWIN estimates water solubility with one of two possible equations. When an experimental melting point is available, WSKOWWIN applies the equation containing both the melting point and the molecular weight (MW) parameters. In the absence of a melting point, the equation containing just the molecular weight is used to make the estimate. All compounds in the 1450 compound training set have known melting points or are known to be liquids at 25 °C. The accuracy statistics for the two equations are as follows:

            Melt Pt + MW    MW only
r2          0.970           0.934
std dev.    0.409           0.585
avg dev.    0.313           0.442
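
To read these deviations on a linear scale, one can convert them to a multiplicative factor. The short Python sketch below assumes the deviations are expressed in log10 units of S, which the text does not state explicitly; the function name is hypothetical.

```python
def fold_error(log_dev: float) -> float:
    """Convert a deviation in log10 units into a multiplicative (fold) factor."""
    return 10.0 ** log_dev

# Standard deviations quoted for the two equations (assumed to be in log units).
fold_with_mp = fold_error(0.409)   # equation with melting point and MW
fold_mw_only = fold_error(0.585)   # equation with MW only
```

Under that assumption, a standard deviation of 0.409 log units corresponds to estimates that are typically within a factor of about 2.6 of the measured value.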

Validation
The WSKOWWIN estimation equations were initially validated on two datasets of compounds that were not included in the model training. A relatively small dataset was tested that consisted of 85 compounds having experimental log Kow values, but no available melting points. Many compounds in the 85 compound test set decompose before melting and would theoretically have very high melting points (e.g. amino acids and compounds having multiple nitrogens). The accuracy statistics for the equation used by WSKOWWIN are:
number 85
r2 0.865
std dev. 0.961
avg dev. 0.714

A much larger dataset of 817 compounds was also tested. All 817 compounds had experimental melting points, but none of the 817 compounds had a reliable experimental log Kow. The log Kow values used for the validation-testing were estimated (primarily using the KOWWIN program available at that time); therefore, the water solubility estimates are based on estimates for log Kow. Typically, estimates based on estimates reduce estimation accuracy, but this type of validation can provide insight into the ability of the method. The accuracy statistics for this dataset are:
number 817
r2 0.902
std dev. 0.615
avg dev. 0.480

Availability of training & validation datasets
The complete datasets used to train and validate the SAR equations used by the WSKOWWIN program are available in two documents prepared for the U.S. Environmental Protection Agency, Office of Pollution Prevention and Toxics (Meylan and Howard, 1994a,b).  These documents, which also detail the estimation methodology, can be downloaded from the Internet at:
http://esc.syrres.com/interkow/EpiSuiteData.htm

Substructure-searchable formats of the data can be downloaded at:
http://esc.syrres.com/interkow/EpiSuiteData_ISIS_SDF.htm

Estimation domain
The WSKOWWIN program applies an individual correction factor only once per structure [if at all] regardless of how many instances of the applicable structural feature occur in the structure. The minimum number of instances is zero and the maximum is one.

Range of water solubilities in the training set:
Minimum = 4 × 10⁻⁷ mg/l (octachlorodibenzo-p-dioxin)
Maximum = completely soluble (various)

Range of molecular weights in the training set:
Minimum = 27.03 (hydrocyanic acid)
Maximum = 627.62 (hexabromobiphenyl)

Range of log Kow values in the training set:
Minimum = -3.89 (aspartic acid)
Maximum = 8.27 (decachlorobiphenyl)

Currently there is no universally accepted definition of model domain. However, users may wish to consider the possibility that water solubility estimates are less accurate for compounds outside the MW range, water solubility range and log Kow range of the training set compounds. It is also possible that a compound may have a functional group(s) or other structural features not represented in the training set, and for which no correction factor was developed.  These points should be taken into consideration when interpreting model results.
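
The range comparison suggested above can be sketched in Python as follows. This is only an illustrative screen using the training-set ranges quoted in this section; the function and constant names are hypothetical, and it does not check for structural features missing from the training set.

```python
from typing import List

# Training-set ranges quoted in this section.
MW_RANGE = (27.03, 627.62)       # molecular weight
LOG_KOW_RANGE = (-3.89, 8.27)    # log Kow

def domain_flags(mw: float, log_kow: float) -> List[str]:
    """List the descriptor ranges of the training set that a compound falls outside."""
    flags: List[str] = []
    if not MW_RANGE[0] <= mw <= MW_RANGE[1]:
        flags.append("MW outside training-set range")
    if not LOG_KOW_RANGE[0] <= log_kow <= LOG_KOW_RANGE[1]:
        flags.append("log Kow outside training-set range")
    return flags
```

An empty list means only that both descriptors lie within the quoted ranges; it says nothing about the accuracy of the estimate itself.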

 
Principles of method if other than guideline:
data derived from assessment
GLP compliance:
not specified
Type of method:
other: data derived from assessment
Water solubility:
0.024 mg/L
Conc. based on:
test mat.
Temp.:
25 °C
Remarks on result:
other: prediction
Conclusions:
The water solubility of the test item was estimated to be 0.024224 mg/l at 25 °C.
Executive summary:

The water solubility of the test item was estimated to be 0.02422 mg/l using EPI Suite (US-EPA, 2004).

Description of key information

Water solubility = 0.02422 mg/l

Key value for chemical safety assessment

Water solubility:
0.024 mg/L

Additional information

The water solubility of the test item was estimated using EPI Suite (US-EPA, 2004).

This estimate is comparable to the water solubility of a structural analogue, Similar Substance 01, which is below 0.06 mg/l (the limit of detection) (OECD guideline 105, 2006, and EU method A.6, 1992). Details on Similar Substance 01 are available in section 13.

Experimental data available for a commercial preparation are reported for completeness. However, this value was measured on a sample with a low content of active substance and a high content of dispersing agents.

The water solubility of the commercial preparation was found to be 10–50 g/l.